[Christian Heimes]
> How expensive and costly are savepoints?

6, maybe 6.2, depending on the units you're using <wink>.  Seriously, how
can such a question be answered?  How expensive is math.log()?

> I wasn't able to find information about it in the Zope docs.

Savepoints are very new, and AFAIK nobody has done timing experiments on
them.

> Are they as expensive as subtransactions or are they just using some CPU
> cycles?

Savepoints are a generalization of subtransactions (and subtransactions
are now implemented "on top of" savepoints), so if you think the cost of a
subtransaction was 100, the cost of a savepoint will be somewhere around
100 too.  Modified state has to be written to temp file(s) in either case,
and in such a way that it can be forgotten later if desired.

> I'm thinking about using savepoints in my migration code. The code is
> migrating a possibly large number of objects (hundreds up to tens of
> thousands). I don't want the code to fail because the last object has a
> unicode decode issue.

This sounds like a good use for savepoints.

> Code example:
>
> for ob in objs:
>     savepoint = transaction.savepoint()
>     try:
>         migrate(ob)
>     except ConflictError:
>         raise
>     except:
>         log()
>         savepoint.rollback()
>
> If savepoints are costly I would create a new savepoint every 10 or 50
> objects.

If I were you, I'd just _try_ it, and fiddle as necessary until I was
happy with the tradeoffs I saw on my real data.  It's not possible to
guess the outcome; e.g., if "a typical call" to migrate() takes 10 seconds
for your objects, the time to make a savepoint will probably be relatively
insignificant; if migrate() takes a nanosecond, the time to make a
savepoint will be relatively huge.

I tried this code:

"""
# ...
# tedious setup code to open a database and hang `tree` off the
# root object
# ...

start = now()
for i in range(N):
    tree[i] = 2*i
    sv = transaction.savepoint()  # "the savepoint line"
transaction.commit()
finish = now()
print(finish - start)
"""

with and without the savepoint line, where `tree` was an OOBTree and `N`
was 1000, and it took 10x longer with the savepoint line.

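A toy model can make a slowdown of this kind easier to reason about.  The
sketch below is plain Python, not ZODB code, and everything in it is an
assumption made for illustration: keys are inserted in increasing order, a
BTree keeps them in buckets of some fixed capacity, a bucket that overflows
splits in half, and each savepoint writes out every bucket modified since
the previous one.  The function name `count_bucket_writes` is hypothetical,
and the capacities 30 and 120 are assumed stand-ins for OOBTree and IIBTree
buckets.

```python
def count_bucket_writes(n_inserts, capacity, savepoint_every=1):
    """Toy model of bucket writes; returns (total_writes, n_buckets).

    savepoint_every=0 means "no savepoints, just one commit at the end".
    This is a simplification for illustration, not how ZODB is implemented.
    """
    sizes = [0]      # sizes[i] is the number of keys in bucket i
    dirty = set()    # indexes of buckets modified since the last savepoint
    writes = 0
    for i in range(1, n_inserts + 1):
        # Increasing keys always land in the rightmost bucket.
        last = len(sizes) - 1
        sizes[last] += 1
        dirty.add(last)
        if sizes[last] > capacity:
            # Overflow: keep half here, move the rest to a new bucket.
            half = sizes[last] // 2
            sizes.append(sizes[last] - half)
            sizes[last] = half
            dirty.add(last + 1)
        if savepoint_every and i % savepoint_every == 0:
            # A savepoint writes the state of every modified bucket.
            writes += len(dirty)
            dirty.clear()
    # The final commit writes whatever is still modified.
    writes += len(dirty)
    return writes, len(sizes)


if __name__ == "__main__":
    for capacity in (30, 120):        # assumed bucket capacities
        for every in (0, 1, 10, 50):  # 0 = no savepoints at all
            writes, buckets = count_bucket_writes(1000, capacity, every)
            print("capacity=%3d savepoint_every=%2d writes=%4d buckets=%2d"
                  % (capacity, every, writes, buckets))
```

Under these assumptions, a savepoint on every insert costs roughly 16
writes per bucket at capacity 30 and roughly 63 at capacity 120, against
one write per bucket with no savepoints; a savepoint every 10 or 50 inserts
lands in between.  Real timings depend on much more than write counts, so
this only suggests a trend.
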
This is probably close to a worst case, because `tree[i] = 2*i` most often
modifies the same bucket it modified on the previous iteration, and taking
a savepoint on each iteration therefore requires writing out the full
state for each bucket many times (about 15 times each, in fact).  Without
the savepoint line, each bucket state is materialized to disk only once.

If I change it to an IIBTree, the discrepancy is even larger (about a
factor of 15), because IIBTrees tend to put many more (key, value) pairs
in their buckets than OOBTrees do, so each bucket state gets written out
many more times with the savepoint line (about 60 times each) than
without.

OTOH, if your idea of migrate() doesn't make changes to the same
containers (or other persistent objects) across iterations, the
discrepancy should get smaller, approaching a factor of 1.0 in the limit
(if no two iterations modify the same persistent object).  It's not
possible to quantify that in advance without knowing everything about your
objects, your containers, and all the details involved in what your
migrate() does.

Of course if this is a one-time migration, I wouldn't worry about expense
at all -- for all I know, it took me longer to write this reply than it
will take you to run the migration script <0.6 wink>.

_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev