Nitro wrote:

> Hello Tres,
>
> thanks for your detailed answers!
>
> On 12.04.2010 at 22:42, Tres Seaver <tsea...@palladion.com> wrote:
>
>>> Additionally I made some quick performance tests. I committed 1kb-sized
>>> objects, and I can do about 40 transactions/s if one object is changed
>>> per transaction. For 100kb objects it's also around 40 transactions/s.
>>> Only for object sizes bigger than that does the raw I/O throughput
>>> start to matter.
>
>> 40 tps sounds low: are you pushing blob content over the wire somehow?
>
> No, that test was with a plain file storage: just a plain Persistent
> object with a differently sized string and an integer attribute. I did
> something like
>
> 1) create object with attribute x (integer) and y (variably sized string)
> 2) for i in range(100): obj.x = i; transaction.commit()
> 3) measure the time taken for step 2
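For concreteness, that loop spelled out as a runnable sketch against a
plain FileStorage (the file name, class name, and payload size below are
illustrative, not taken from the original test):

    import time
    import transaction
    from persistent import Persistent
    from ZODB import DB
    from ZODB.FileStorage import FileStorage

    class Payload(Persistent):        # illustrative stand-in for the test object
        def __init__(self, size):
            self.x = 0                # integer attribute
            self.y = 'a' * size       # variably sized string attribute

    db = DB(FileStorage('bench.fs'))  # plain file storage, no ZEO involved
    conn = db.open()
    root = conn.root()
    root['obj'] = Payload(1024)       # ~1kb payload
    transaction.commit()

    obj = root['obj']
    start = time.time()
    for i in range(100):              # one changed object per transaction
        obj.x = i
        transaction.commit()
    elapsed = time.time() - start
    print('%.1f transactions/s' % (100 / elapsed))
    db.close()

A loop like this is usually bound by per-commit disk sync latency rather
than raw I/O bandwidth, which would explain why 1kb and 100kb objects
commit at about the same rate.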
>>> Still don't know the answers to these:
>>>
>>> - Does it make sense to use ZODB in this scenario? My data is not
>>>   suited well for an RDBMS.

>> YMMV. I still default to using ZODB for anything at all, unless the
>> problem smells very strongly relational.

> Ok, the problem at hand certainly doesn't smell relational. It is more
> about storing lots of different data than querying it extensively. It's
> a mixture of digital asset management (the blobs are useful for this
> part) and "projects" which reference the assets. The projects are
> shared between the clients and will consist of a big tree with
> Persistent objects hooked up to it.

I have seen the ZEO storage committing transactions at least an order of
magnitude faster than that (e.g., when processing incoming newswire
feeds). I would guess that there could have been some other latencies
involved in your setup (e.g., that 0-100ms lag you mention below).

>>> - Are there more complications to blobs other than a slightly
>>>   different backup procedure?

>> You need to think about how the blob data is shared between ZEO
>> clients (your appserver) and the ZEO storage server: opinions vary
>> here, but I would prefer to have the blobs living in a writable shared
>> filesystem, in order to avoid the necessity of fetching their data
>> over ZEO on the individual clients which were not the one "pushing"
>> the blob into the database.

> The ZEO server and clients will be in different physical locations, so
> I'd probably have to employ some shared filesystem which can deal with
> that. Speaking of locations of server and clients, is it a problem - as
> in, will ZEO perform very badly under these circumstances because it
> was not designed for this - if they are not in the same location
> (typical latency 0-100ms)?

That depends on the mix of reads and writes in your application. I have
personally witnessed a case where the clients stayed up and serving
pages over a whole weekend in a clusterfsck where both the ZEO server
and the monitoring infrastructure went belly up. This was for a large
corporate intranet, in case that helps: the problem surfaced mid-morning
on Monday, when the employee in charge of updating the lunch menu for
the week couldn't save the changes.

>>> - Are there any performance penalties by using very large
>>>   invalidation queues (i.e. 300,000 objects) to reduce client cache
>>>   verification time?

>> At a minimum, RAM occupied by that queue might be better used
>> elsewhere. I just don't use persistent caches, and tend to reboot
>> appservers in rotation after the ZEO storage has been down for any
>> significant period (almost never happens).

> In my case the clients might be down for a couple of days (typically 1
> or 2 days), and they should not spend 30 mins in cache verification
> each time they reconnect. So if these 300k objects take up 1k each,
> then they occupy 300 MB of RAM, which I am fine with.

If the client is disconnected for any period of time, it is far more
likely that just dumping the cache and starting over fresh will be a
win. The 'invalidation_queue' is primarily there to support clients
which remain up while the storage server is down or unreachable.

>>> From what I've read it only seems to consume memory.

>> Note that the ZEO storage server makes copies of that queue to avoid
>> race conditions.

> Ok, I can see how copying and storing 300k objects is slow and can
> take up excessive amounts of memory.

Tres.
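P.S. For anyone wiring this up, the two knobs discussed above look
roughly like the following; the host, port, paths, and queue size are
illustrative, not recommendations:

    from ZEO.ClientStorage import ClientStorage
    from ZODB import DB

    # Client side: blob_dir points at the writable shared filesystem, and
    # shared_blob_dir=True tells the client it is the same directory the
    # storage server writes to, so blob data is read straight off the
    # filesystem instead of being pulled over the ZEO protocol.
    storage = ClientStorage(
        ('zeo.example.com', 8100),      # illustrative server address
        blob_dir='/mnt/shared/blobs',
        shared_blob_dir=True,
    )
    db = DB(storage)

    # Server side, the invalidation queue is sized in zeo.conf; each entry
    # costs RAM, and the server copies the queue, so a 300,000-entry queue
    # is paid for more than once:
    #
    #   <zeo>
    #     address 8100
    #     invalidation-queue-size 300000
    #   </zeo>
    #
    #   <blobstorage 1>
    #     blob-dir /mnt/shared/blobs
    #     <filestorage>
    #       path /var/zeo/Data.fs
    #     </filestorage>
    #   </blobstorage>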