I started a new packing script for RelStorage (history-free,
PostgreSQL). It is based on incoming reference counting. Why do we need
this? First a bit of storytelling (fast-forward down to the "new
script" part if you're bored).
So we had a database with ~298,000,000 objects stored, grown over 4-5
months. The beast was never packed. We import/sync data on a weekly
basis: a complex Plone-based product catalog optimized for the web, in
27 languages with ~18,000 products each, each product with several
variants (stored in annotated BTrees). Lots of stuff is deleted and
recreated on import, so we had a lot of garbage in the DB.
We're using history-free RelStorage. RelStorage performs much better on
concurrent writes and speeds up our import a lot. First on MySQL, now on
PostgreSQL. Even if PostgreSQL does not solve all problems, it seems to
be the stable choice. At least once we had a frozen server. We blame
MySQL, but well, it's a CentOS box and there are several other problems
injecting side effects (old unpatched kernel, and so on... we're dev,
not ops; it's enterprise level and decisions made in the past are our
constraints, carved in stone - what more is there to say?)
Anyway: our Plone with ZODB on RelStorage performs very, very well -
with this many objects stored, including all the garbage!
BUT... we want to get rid of the garbage left behind by every import.
RelStorage ships with a packing script. Running it on the ~300
million objects resulted in RAM consumption of >20GB with an estimated
time to completion of 300 days (we did not check whether that estimate
holds). Phew.
So something had to be done. Storage usage grew by about 15-20GB per
week. As a quick fix (it took some days) we decided to try classical
Data.fs packing. We blocked content managers from editing, stopped the
imports, and cloned a snapshot of the virtual machine with the DB.
At this point we had roughly 160GB in Postgres.
On the clone we converted the RelStorage to a FileStorage. This took
40 hours. The storage shrank to a 55GB Data.fs.
Then we started the classical packing, and after some hours the DB had
shrunk to a 7.8GB Data.fs, still containing 44 million objects.
Next we converted the Data.fs back to RelStorage (~6 hours) and switched
the live system over to the packed DB.
Packing the 44-million-object RelStorage with the default script
would take several days (3-4 days) for the pre-pack phase alone. It
consumes less memory (~5GB) but is still insane.
After analyzing the current (generic - MySQL, PostgreSQL, Oracle -
history and history-free) way of packing the DB, I found no good way to
optimize the code here. Others were there before me.
So I tried to rethink the way packing is done and to follow the
copy-and-switch method that FileStorage uses: copy all in-use zoids from
object_state to a new table (traversing the object graph starting at
zoid 0), drop the old object_state, and rename the new table to
object_state. The copy process can be distributed: several workers can
do it in parallel until the database is at its maximum
transactions/second. A rough single-process sketch follows below.
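Reduced to one process and minus the worker queue, it looks roughly
like this - the schema (zoid, tid, state_size, state) is the standard
history-free one, but the pack-table name and connection string are
made up for illustration:

    import psycopg2
    from ZODB.serialize import referencesf
    from ZODB.utils import u64

    conn = psycopg2.connect("dbname=zodb")   # illustrative DSN
    cur = conn.cursor()
    cur.execute("CREATE TABLE object_state_pack"
                " (LIKE object_state INCLUDING ALL)")

    # walk the object graph from the root, copying every reachable state
    queue, seen = [0], {0}
    while queue:
        zoid = queue.pop()
        cur.execute("SELECT tid, state_size, state FROM object_state"
                    " WHERE zoid = %s", (zoid,))
        row = cur.fetchone()
        if row is None:
            continue
        cur.execute("INSERT INTO object_state_pack VALUES (%s, %s, %s, %s)",
                    (zoid,) + row)
        for oid in referencesf(bytes(row[2])):   # outgoing references
            ref = u64(oid)
            if ref not in seen:
                seen.add(ref)
                queue.append(ref)

    # the switch: everything not copied vanishes with the old table
    cur.execute("DROP TABLE object_state")
    cur.execute("ALTER TABLE object_state_pack RENAME TO object_state")
    conn.commit()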
I had this working: 4-6 workers running, a queue of zoids to process
in the DB, a master controlling the process, and so on. But even with
this approach, the total time for a pack of the whole DB - after SQL
optimizations(!) - is roughly 8 days for a 44 million object graph.
OK - Back to the drawing board.
On Thursday evening I had a good conversation (and some beer) with a
buddy from the local Linux user group and we discussed the problem. The
outcome was to try it with reference counting.
The idea is simple:
- iterate over all transactions, starting with the lowest
  transaction id (tid)
- for each transaction, load the object states connected with that tid
- for each state, fetch its outgoing references and fill a table where
  all incoming references of an object are stored as an array;
  if a state has no references, write it to the table anyway with an
  empty array (a sketch of this build step follows right after this list)
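In code, the build could look roughly like this - object_inrefs is a
table name I made up, and for a 44-million-row object_state you would
want a server-side cursor, but it shows the mechanics:

    import psycopg2
    from ZODB.serialize import referencesf
    from ZODB.utils import u64

    conn = psycopg2.connect("dbname=zodb")   # illustrative DSN
    cur = conn.cursor()
    cur.execute("""CREATE TABLE IF NOT EXISTS object_inrefs (
                       zoid   BIGINT PRIMARY KEY,
                       inrefs BIGINT[] NOT NULL DEFAULT '{}')""")

    cur.execute("SELECT DISTINCT tid FROM object_state ORDER BY tid")
    for (tid,) in cur.fetchall():
        states = conn.cursor()
        states.execute("SELECT zoid, state FROM object_state"
                       " WHERE tid = %s", (tid,))
        for zoid, state in states:
            # a row for the object itself, even if nothing references it
            cur.execute("INSERT INTO object_inrefs (zoid) VALUES (%s)"
                        " ON CONFLICT DO NOTHING", (zoid,))
            for oid in referencesf(bytes(state)):
                target = u64(oid)
                cur.execute("INSERT INTO object_inrefs (zoid) VALUES (%s)"
                            " ON CONFLICT DO NOTHING", (target,))
                # record the incoming edge zoid -> target (once)
                cur.execute("""UPDATE object_inrefs
                               SET inrefs = array_append(inrefs, %s::bigint)
                               WHERE zoid = %s
                                 AND NOT inrefs @> ARRAY[%s::bigint]""",
                            (zoid, target, zoid))
        conn.commit()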
After reaching the highest tid it's easy: all entries with no incoming
references are garbage. Delete them, and remove their zoids from the
incoming-reference arrays of the other entries.
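The sweep could then be a loop like the sketch below (same made-up
object_inrefs table as above). Two details: the root object (zoid 0)
never has incoming references, so it must be exempted; and since
removing a deleted zoid from the arrays can empty further arrays, the
loop runs until a pass finds nothing:

    import psycopg2

    conn = psycopg2.connect("dbname=zodb")   # illustrative DSN
    cur = conn.cursor()

    while True:
        # no incoming references -> garbage, except the root object
        cur.execute("SELECT zoid FROM object_inrefs"
                    " WHERE inrefs = '{}' AND zoid <> 0")
        garbage = [z for (z,) in cur.fetchall()]
        if not garbage:
            break
        cur.execute("DELETE FROM object_state"
                    " WHERE zoid = ANY(%s::bigint[])", (garbage,))
        cur.execute("DELETE FROM object_inrefs"
                    " WHERE zoid = ANY(%s::bigint[])", (garbage,))
        # strip the deleted zoids out of the remaining arrays
        cur.execute("""UPDATE object_inrefs
                       SET inrefs = (SELECT coalesce(array_agg(r), '{}')
                                     FROM unnest(inrefs) AS r
                                     WHERE NOT r = ANY(%s::bigint[]))
                       WHERE inrefs && %s::bigint[]""", (garbage, garbage))
        conn.commit()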
When new transactions happen, it's enough to process only the delta
transactions (those newer than the last tid recorded in the
reference-counting table). This is pretty fast if the delta is small,
e.g. one day's worth. On the deltas another check is needed to detect
references that have gone away in the meantime, but since the delta
isn't that big, this is probably not a problem.
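This part is not written yet, but an incremental pass might look like
the sketch below: strip the changed zoids out of all arrays (which
drops the references gone in the meantime), then re-add their current
outgoing references as in the full build. last_tid stands for
bookkeeping I haven't settled on:

    import psycopg2
    from ZODB.serialize import referencesf
    from ZODB.utils import u64

    conn = psycopg2.connect("dbname=zodb")   # illustrative DSN
    cur = conn.cursor()
    last_tid = 0  # hypothetical: highest tid handled by the previous run

    cur.execute("SELECT zoid, state FROM object_state WHERE tid > %s",
                (last_tid,))
    changed = cur.fetchall()
    if changed:
        zoids = [zoid for zoid, _ in changed]
        # forget the old outgoing references of the changed objects
        cur.execute("""UPDATE object_inrefs
                       SET inrefs = (SELECT coalesce(array_agg(r), '{}')
                                     FROM unnest(inrefs) AS r
                                     WHERE NOT r = ANY(%s::bigint[]))
                       WHERE inrefs && %s::bigint[]""", (zoids, zoids))
        # ...then re-add the current ones, exactly as in the full build
        for zoid, state in changed:
            cur.execute("INSERT INTO object_inrefs (zoid) VALUES (%s)"
                        " ON CONFLICT DO NOTHING", (zoid,))
            for oid in referencesf(bytes(state)):
                target = u64(oid)
                cur.execute("INSERT INTO object_inrefs (zoid) VALUES (%s)"
                            " ON CONFLICT DO NOTHING", (target,))
                cur.execute("""UPDATE object_inrefs
                               SET inrefs = array_append(inrefs, %s::bigint)
                               WHERE zoid = %s
                                 AND NOT inrefs @> ARRAY[%s::bigint]""",
                            (zoid, target, zoid))
        conn.commit()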
I have implemented this roughly now. Building the incoming-reference-
counting table from scratch takes less than 7 hours. The deletion
itself is not implemented yet, but that is not that difficult :-)
The current code is at:
Did I miss something? Any opinions are much appreciated!
Expect updates in this thread :)
Klein & Partner KG, member of BlueDynamics Alliance