Roché Compaan wrote at 2006-6-22 21:53 +0200:
>What overhead does undo add to performance?
Very little -- apart from a fast-growing storage file.
However, the log (append-only) behaviour of "FileStorage" means that
you get a very different notion of locality.
In a relational database, records in the same table have
some chance to be near to one another. With a "FileStorage"
records modified in the same transaction are near to one another.
In general, locality in a "FileStorage" is much poorer than
in a relational database. This means that the equivalent
of a "full table scan" would be much less efficient.
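To make the locality point concrete, here is a toy model of an append-only log. It is only a sketch: the names "log", "commit" and "positions" are made up for illustration and have nothing to do with the real FileStorage code or file format.

```python
# Toy model of an append-only log: every commit appends its records at the end.
log = []  # (transaction_id, kind) tuples in on-disk order


def commit(tid, kinds):
    """Append one record per modified object, in commit order."""
    for kind in kinds:
        log.append((tid, kind))


# Three transactions, each touching one customer, one order and one invoice.
for tid in range(3):
    commit(tid, ["customer", "order", "invoice"])


def positions(predicate):
    """Return the log offsets of all records matching the predicate."""
    return [i for i, rec in enumerate(log) if predicate(rec)]


same_txn = positions(lambda rec: rec[0] == 1)          # contiguous offsets
same_kind = positions(lambda rec: rec[1] == "order")   # scattered offsets
print(same_txn, same_kind)
```

Records from one transaction sit next to each other, while records of one kind (the analogue of "one table") are spread over the whole file.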
>Can state be serialised more economically to reduce disk IO?
Sure: the ZODB uses a very bulky serialization format:
Each object record contains the full path to its class,
and the state is described in a self-contained way
(explicitly naming all attributes and their values).
This gives you lots of redundancy (compared to a relational
system where the field structure is not carried in each row).
For your most frequent object types, you may work with slots
rather than dicts (this means that the class determines the fields,
not each individual object).
The newest pickle formats can also handle the class references
a bit more efficiently -- at least when a single transaction
modifies many objects of the same class.
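The redundancy from naming every attribute can be seen with the standard "pickle" module alone. The following is a minimal sketch, not ZODB code: "DictPoint" and "CompactPoint" are hypothetical classes, and the compact variant assumes a custom __getstate__/__setstate__ pair that stores only the values. As far as I can tell, plain __slots__ without such a pair still writes the attribute names into every pickle.

```python
import pickle


class DictPoint:
    """Ordinary class: the state pickles as a dict naming every attribute."""

    def __init__(self, x, y, label):
        self.x, self.y, self.label = x, y, label


class CompactPoint:
    """Hypothetical class: field names live in the class, not in each record."""

    __slots__ = ("x", "y", "label")

    def __init__(self, x, y, label):
        self.x, self.y, self.label = x, y, label

    def __getstate__(self):
        # Store only the values; their order is fixed by __slots__.
        return tuple(getattr(self, name) for name in self.__slots__)

    def __setstate__(self, state):
        for name, value in zip(self.__slots__, state):
            setattr(self, name, value)


dict_size = len(pickle.dumps(DictPoint(1, 2, "a"), protocol=2))
compact_size = len(pickle.dumps(CompactPoint(1, 2, "a"), protocol=2))
restored = pickle.loads(pickle.dumps(CompactPoint(1, 2, "a"), protocol=2))
print(dict_size, compact_size)
```

The price is that the field order becomes part of the stored format, so adding or reordering fields later needs a migration step.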
>Is the ZODB really slow
For highly structured data, the ZODB is necessarily considerably slower than
a (well designed) relational database.
That's because the relational database makes use of the "highly structured"
property while the ZODB ignores it.
There is an additional reason why object-oriented databases
tend to be considerably slower than relational ones:
With a standard relational database, the querying operations
are executed on the server -- near to the data.
Relatively little data travels from the server to the client.
Object oriented databases tend to have a stupid server --
one that knows only state but no behaviour.
Consequently, the server cannot do anything with the objects
it stores -- all operations must be done on the clients (which have
the behaviour). This means that the operations are not performed
near to the data and lots of data needs to travel from the server to
the client (often to be discarded there).
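The difference in data shipped can be sketched with a toy model. Nothing here is a real database or ZEO API: "records", "relational_style" and "object_store_style" are invented names, and the "network" is just a count of records handed over.

```python
# 1000 stored records; in both cases the query is "price > threshold".
records = [{"id": i, "price": i * 10} for i in range(1000)]


def relational_style(threshold):
    """The server evaluates the predicate; only matches cross the wire."""
    shipped = [r for r in records if r["price"] > threshold]
    return shipped, len(shipped)


def object_store_style(threshold):
    """The server only hands out state; the client loads everything and filters."""
    shipped = list(records)  # every record travels to the client
    matches = [r for r in shipped if r["price"] > threshold]
    return matches, len(shipped)


rel_matches, rel_shipped = relational_style(9900)
obj_matches, obj_shipped = object_store_style(9900)
print(rel_shipped, obj_shipped)
```

Both variants return the same 9 matches, but the second one moves all 1000 records to the client in order to find them.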
ZODB-Dev mailing list - ZODB-Dev@zope.org