[Zope] Structure for large schema

David Pratt Mon, 17 Apr 2006 10:08:54 -0700

Hi. I asked this question on the zope3 list about a week ago and had no responses. I am hoping I can receive some general guidance on this issue. I am trying to determine the best structure for storing a large schema where some attributes are lists or dictionaries. There are about 70 attributes in the schema and I am trying to choose a structure that will not necessarily have to hold a bunch of empty space. I have a base schema of approximately 30 attributes and others that subclass from it, with the largest being about 70.

I had originally thought of RDF at the outset and built a datastore with rdflib using a relational database. I chose this option because when a ZODB gets larger it takes plenty of RAM. The problem here is the number of accesses needed to gather a complete object. It is pretty efficient from a storage perspective, since if an object does not have particular attributes you are not storing them, and all items in the store are unique. I was not duplicating a single piece of data. But say you wanted to present a page consisting of 20 items, or do a search. Gathering this up takes a long time when you are hitting the disk many times just to assemble a single object, so query times were unacceptable, and loading RDF from outside sources also took a very long time. The data store grows into millions of records, so you had better have a pretty sweet RDB server with lots of RAM as well.

I had dismissed a relational database on its own since the data does not lend itself to a row, and I may want to add to the schema at some point in time, which could mean some pretty ugly business this way. But then I saw the vertical example in the examples folder of SQLAlchemy, which creates dynamic fields as necessary and could potentially avoid this kind of hassle.
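
To make the vertical idea concrete, here is a minimal sketch using plain sqlite3 from the standard library. It is not the SQLAlchemy example itself; the table names (item, item_attr), the helper functions and the sample attributes are all made up for illustration. Each attribute becomes a row, so empty attributes take no space and new attribute names need no schema change.

```python
import sqlite3

# In-memory database for illustration; table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE item (id INTEGER PRIMARY KEY);
    CREATE TABLE item_attr (
        item_id INTEGER NOT NULL REFERENCES item(id),
        name    TEXT NOT NULL,
        value   TEXT,
        PRIMARY KEY (item_id, name)
    );
""")

def save_item(conn, attrs):
    """Insert one item, writing a row only for attributes that have a value."""
    cur = conn.execute("INSERT INTO item DEFAULT VALUES")
    item_id = cur.lastrowid
    conn.executemany(
        "INSERT INTO item_attr (item_id, name, value) VALUES (?, ?, ?)",
        [(item_id, k, v) for k, v in attrs.items() if v is not None],
    )
    return item_id

def load_item(conn, item_id):
    """Rebuild the whole item from its attribute rows with a single query."""
    rows = conn.execute(
        "SELECT name, value FROM item_attr WHERE item_id = ?", (item_id,)
    )
    return dict(rows.fetchall())

# "subject" is empty, so no row is stored for it.
item_id = save_item(conn, {"title": "Example", "creator": "David", "subject": None})
loaded = load_item(conn, item_id)
```

Unlike the triple store, a complete object comes back from one query on item_id, though list and dictionary values would still need some encoding into the value column.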

The ZODB provides the flexibility, and Generations could work well for future updates, so this looks very good. But how efficient is it if 15% of the attributes have data and 85% do not?
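
One way to reason about this: by default a persistent object's stored state is its instance __dict__, so defaults kept on the class take no space in each record. The sketch below illustrates that with plain Python objects and pickle instead of ZODB itself; the field names and the two classes are hypothetical, and the sizes are only a rough proxy for what a real storage would write.

```python
import pickle

# Hypothetical attribute names standing in for a 70-attribute schema.
FIELDS = ["field%02d" % i for i in range(70)]

class SparseRecord(object):
    """Defaults live on the class; an instance stores only what is set."""

for name in FIELDS:
    setattr(SparseRecord, name, None)

class EagerRecord(object):
    """Every attribute is assigned on the instance, whether used or not."""
    def __init__(self):
        for name in FIELDS:
            setattr(self, name, None)

def fill(record, count):
    """Set the first `count` attributes to a dummy value."""
    for name in FIELDS[:count]:
        setattr(record, name, "some value")
    return record

sparse = fill(SparseRecord(), 10)   # roughly 15% of 70 attributes
eager = fill(EagerRecord(), 10)

# The instance __dict__ is what default pickling would store as state.
sparse_size = len(pickle.dumps(vars(sparse), 2))
eager_size = len(pickle.dumps(vars(eager), 2))
```

Here vars(sparse) holds 10 entries while vars(eager) holds all 70, and unset attributes on the sparse record still read as None through the class. Class-level defaults should be immutable values such as None, since a list or dictionary placed on the class would be shared by every instance.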

I have also been experimenting with hybrid pickle / RDB storage, so that the attributes that will receive the most attention are stored as fields and the full record is stored as a pickle that is unpickled for views and data entry.
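
A rough sketch of that hybrid layout, again with sqlite3 and pickle from the standard library. The table name (record), the choice of title and creator as the frequently queried fields, and the helper functions are assumptions for illustration only.

```python
import pickle
import sqlite3

# In-memory database for illustration; names below are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE record (
        id      INTEGER PRIMARY KEY,
        title   TEXT,
        creator TEXT,
        data    BLOB NOT NULL
    )
""")
conn.execute("CREATE INDEX record_creator ON record (creator)")

def save_record(conn, attrs):
    """Store the hot attributes as columns and the full record as a pickle."""
    blob = pickle.dumps(attrs, 2)
    cur = conn.execute(
        "INSERT INTO record (title, creator, data) VALUES (?, ?, ?)",
        (attrs.get("title"), attrs.get("creator"), sqlite3.Binary(blob)),
    )
    return cur.lastrowid

def load_record(conn, record_id):
    """Fetch one row and unpickle the full record, e.g. for a view or edit form."""
    row = conn.execute(
        "SELECT data FROM record WHERE id = ?", (record_id,)
    ).fetchone()
    return pickle.loads(bytes(row[0])) if row else None

def find_by_creator(conn, creator):
    """Search on an indexed column without touching any pickles."""
    rows = conn.execute(
        "SELECT id FROM record WHERE creator = ? ORDER BY id", (creator,)
    )
    return [r[0] for r in rows]

rid = save_record(conn, {
    "title": "Example",
    "creator": "David",
    "keywords": ["zope", "schema"],   # list attribute
    "extra": {"source": "import"},    # dictionary attribute
})
record = load_record(conn, rid)
```

Searches hit only the indexed columns, and a full object costs one row read plus one unpickle. The trade-off is that the columns duplicate data held in the pickle, so both have to be written together on every update.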

In any case, I thought I'd ask since I am concerned about both the efficiency of storage and the speed of access. If RDF access were fast it would be great, but this has not been the case. I just thought there may be some other ideas on this, or someone could advise on the efficiency of ZODB when, in some cases, users will be selective about which attributes are important to them.


Regards,
David
