Adam Groszer wrote at 2006-6-23 14:27 +0200:
> ...
>Some fears they are having and I can't find unambiguous information:
>- Is ZODB a good choice for this app?

It depends... At the very least, careful design is necessary!

The most problematic aspect of the ZODB is write conflicts. When two concurrent transactions modify the same object, a write conflict occurs unless the object provides application specific conflict resolution (which might resolve some conflicts).

This behaviour is especially serious when you have expensive (long running) transactions. The longer a transaction runs, the higher the risk that it interferes with another transaction and the higher the cost of the resulting conflict.

This means that you must carefully design your system to reduce the risk of write conflicts.

You can (and should!), for example, use workflow to prevent the same application object (read "document" in your case) from being modified by concurrent transactions. You are interested in this for other reasons as well (you do not want to wipe out the work of a colleague by overwriting his changes). This takes care of one side of the front (application objects).

However, there are also global objects which can be modified concurrently. The most prominent example: catalog data structures (read "indexes"). They do use application specific conflict resolution -- but it is often not good enough...

For the catalog, you can move indexing operations out to a separate thread (done e.g. by the "QueueCatalog" Zope product). This considerably reduces write conflicts, at the cost that indexing operations are no longer inline but lag a bit behind.

If you implement other global objects with high write probability, you need to either implement application specific conflict resolution (which is not always easy) or carefully reduce the conflict probability (e.g. by relaying the changes to a separate thread).

Other aspects you should care about:

  * how to make backups for your system

    "FileStorage" tends to produce a few huge files which are difficult to back up (and restore) with standard means (standard incremental backup will not work).
    There is "repozo" to get efficient (non-standard) incremental backup.

  * if possible, partition your data and put each partition in its own storage. Your partitions should be self contained (in order to move them around and backup/restore them individually)

  * (FileStorage) startup time can be proportional to the storage size (when there is no up to date index file, e.g. due to an abnormal shutdown)

  * (FileStorage) RAM linear in the number of objects is needed (to maintain the map "oid --> fileposition"). That is probably not yet a concern for a few 100.000 objects. It will become one when you get a few 100.000.000 objects...

>Which storage to consider?
>Filestorage?
>maybe PGStorage?

I think the only trustworthy storages are "FileStorage" and "DirectoryStorage" (requires file systems optimized for huge directories, e.g. ReiserFS).

>- ACID properties. Is it really ACID, I mean data consistency level
>could be compared to a RDB?

You know that ACID is not ACID -- and that even most relational databases are not truly ACID.

The ZODB does not guarantee the "sequential" (usually called "serializable") transaction isolation model, i.e. the guarantee that the execution of any set of transactions is equivalent to some sequential execution of these transactions.

A realistic example (which caused a bug in Zope's catalog indexes) where this fails looks like this:

  Transaction 1: deletes a document "d" from an index document list "index[term]" and deletes "index[term]" when the list becomes empty

    if len(index[term]) == 1:
        # "d" is the only entry: drop the whole list
        del index[term]
    else:
        index[term].remove(d)

  Transaction 2: adds a document "x" to an index document list "index[term]" and creates the list, if necessary.

    if term not in index:
        index[term] = DocumentList()
    index[term].insert(x)

If these transactions are executed concurrently, the "x" may get lost; for example if the document list contains just "d". In this case, transaction 1 will delete "index[term]" (because it does not yet see the effect of transaction 2).
Transaction 2 will add the "x" to the old document list, which is no longer used as soon as transaction 1 commits. Note that no sequential execution of T1 and T2 can have this result.

Note that I had to work a bit to come up with the example above. The more straightforward implementation:

  Transaction 1:

    index[term].remove(d)
    if not index[term]:
        del index[term]

would *not* have this problem -- because both transactions try to modify the same object ("index[term]"), which is recognized and prevented by the ZODB (the former bug in Zope's indexing implementation resulted from the application specific conflict resolution, which wrongly claimed to have resolved the conflict).

Note again that you can get similar effects with relational databases as well -- as truly sequential transaction isolation is very expensive and usually not implemented by default!

For your bosses: the transaction isolation mode implemented by the ZODB is "snapshot transaction isolation".

>Here we do not have any experience and the
>application should be a real good one.
>- Coming from the RDB world, what are our current choices to provide a
>referential integrity like service on top of ZODB?

What do you mean by "referential integrity service"? Foreign keys and the corresponding consistency checks? You do not have something like this in stock ZODB, but you can implement it on top of it.

>Currently we have schooltool.relations in sight.

Something I do not know, but which might give you what you are looking for.

-- 
Dieter
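
PS: The two variants of transaction 1 discussed above can be played through with a toy model in plain Python. This is only a sketch and not ZODB code: "Store", "Transaction", "read", "modify" and the oids "index" and "docs" are names made up for the illustration. The only thing modelled is that each transaction works on its own snapshot and that a commit fails when another transaction has meanwhile committed a change to an object it wrote.

```python
import copy


class ConflictError(Exception):
    """Two concurrent transactions wrote the same object."""


class Store:
    """Toy object store mapping oid -> (serial, state). Hypothetical, not ZODB."""

    def __init__(self, objects):
        self.data = {oid: (0, state) for oid, state in objects.items()}

    def state(self, oid):
        return self.data[oid][1]


class Transaction:
    """Works on a private snapshot; conflicts are checked per written object."""

    def __init__(self, store):
        self.store = store
        self.snapshot = copy.deepcopy(store.data)  # view as of transaction start
        self.written = set()

    def read(self, oid):
        return self.snapshot[oid][1]

    def modify(self, oid):
        self.written.add(oid)  # remember that this object was changed
        return self.snapshot[oid][1]

    def commit(self):
        # Fail if somebody else committed a change to an object we wrote.
        for oid in self.written:
            if self.store.data[oid][0] != self.snapshot[oid][0]:
                raise ConflictError(oid)
        for oid in self.written:
            serial, state = self.snapshot[oid]
            self.store.data[oid] = (serial + 1, state)


INDEX, DOCS = "index", "docs"  # made-up oids


def t1_tricky(txn):
    index = txn.read(INDEX)
    if len(txn.read(index["term"])) == 1:
        del txn.modify(INDEX)["term"]  # writes only the index mapping
    else:
        txn.modify(index["term"]).remove("d")


def t1_straightforward(txn):
    index = txn.read(INDEX)
    docs = txn.modify(index["term"])  # writes the document list itself
    docs.remove("d")
    if not docs:
        del txn.modify(INDEX)["term"]


def t2(txn):
    index = txn.read(INDEX)
    # "term" already exists in this scenario, so no new list is created
    txn.modify(index["term"]).add("x")


def run(t1):
    # The index maps "term" to the oid of a document list holding just "d".
    store = Store({INDEX: {"term": DOCS}, DOCS: {"d"}})
    a, b = Transaction(store), Transaction(store)  # same starting snapshot
    t1(a)
    t2(b)
    a.commit()
    try:
        b.commit()
    except ConflictError:
        return "conflict"
    return store.state(INDEX)
```

In this model, run(t1_tricky) returns an empty index: both commits succeed because the transactions wrote different objects, and the "x" ends up in a list nothing refers to any more. run(t1_straightforward) returns "conflict" because both transactions wrote the document list.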