On Sat, Jan 19, 2013 at 10:00 AM, Jim Fulton <j...@zope.com> wrote:

> - ZODB doesn't simply load your database into memory.
>   It loads objects when you try to access their state.
>   If you're using ZEO (or relstorage, or neo), each load requires a
>   round-trip to the server.  That's typically a millisecond or two,
>   depending on your network setup.  (Your database is small, so disk
>   access shouldn't be an issue, as it is presumably in your disk
>   cache.)

I understand. My setup seems to be able to unghost about 10,000 catalog-related
objects per minute - does that sound about right?
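As a back-of-envelope check of that rate against Jim's 1-2ms round-trip figure (my own arithmetic, not from the ZODB docs):

```python
# If each object load costs roughly one ZEO round-trip, the observed
# unghosting rate implies a per-object latency in the low milliseconds,
# which is in the same ballpark as Jim's 1-2ms round-trip estimate plus
# unpickling overhead.
objects_per_minute = 10_000                      # observed rate
ms_per_object = 60_000 / objects_per_minute      # 60,000 ms in a minute
print(f"~{ms_per_object:.0f} ms per object load")
```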

> - You say it often takes you a couple of minutes to handle requests.
>   This is obviously very long.  It sounds like there's an issue
>   with the way you're using the catalog.  It's not that hard get this
>   wrong.  I suggest either hiring someone with experience in this
>   area to help you or consider using another tool, like solr.
>   (You could put more details of your application here, but I doubt
>   people will be willing to put in the time to really analyze it and
>   tell you how to fix it.  I know I can't.)

That's alright, I won't ask for such a time investment. As it is, I greatly
appreciate everyone who has replied and helped out already - thanks guys!

> - solr is so fast it almost makes me want to cry.  At ZC, we're
>   increasingly using solr instead of the catalog.  As the original
>   author of the catalog, this makes me sad, but we just don't have the
>   time to put in the effort to equal solr/lucene.
> - A common mistake when using ZODB is to use it like a relational
>   database, putting most data in catalog-like data structures and
>   querying to get most of your data.  The strength of a OODB is that
>   you don't have to query to get data from a well-designed object
>   model.
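To illustrate the point about well-designed object models, here is a minimal stdlib sketch (not actual ZODB code - plain objects and dicts stand in for persistent classes and OOBTrees):

```python
# Sketch: with a good object model, most reads are attribute/containment
# traversal from the root, not catalog queries. Plain dicts stand in for
# ZODB's OOBTree containers here.

class Document:
    def __init__(self, doc_id, title):
        self.doc_id = doc_id
        self.title = title

class Folder:
    def __init__(self):
        self.contents = {}   # in real ZODB this would be an OOBTree

    def add(self, key, obj):
        self.contents[key] = obj

# Root -> year folder -> document: no query needed when you know the path.
root = Folder()
root.add("2013", Folder())
root.contents["2013"].add("doc-1", Document("doc-1", "Minutes"))

doc = root.contents["2013"].contents["doc-1"]   # direct traversal
print(doc.title)
```

In real ZODB each traversal step loads at most one object, so a known path costs a handful of loads, versus waking up large index structures for a query.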

My use case is basically this: I have 400,000 'documents' with 17 attributes
that I want to search on. One of them is the date of the document. This one
I could easily do away with, as the documents are organized roughly by date.
However, if I want to get a 'document' made at any date but with a certain
attribute in a certain range, I don't have a good way to do it based on how
the documents are organized now. I could try making my own indexing scheme,
but I figured ZCatalog would be well-suited for this...

> On Thu, Jan 17, 2013 at 12:31 PM, Claudiu Saftoiu <csaft...@gmail.com> wrote:
> ...
> > One potential thing is this: after a zeopack the index database .fs file is
> > about 400 megabytes, so I figure a cache of 3000 megabytes should more than
> > cover it. Before a zeopack, though - I do one every 3 hours - the file grows
> > to 7.6 gigabytes.
> In scanning over this thread while writing my last message, I noticed
> this.
> This is a ridiculous amount of churn. There is likely something
> seriously out of whack with your application.  Every application is
> different, but we typically see *weekly* packs reduce database size by
> at most 50%.

All that database contains is: a catalog with 17 indices of 400,000 objects,
the root object, a document map, and an object to hold the catalog. The
document map itself I put as a 'document_map' attribute of the catalog.
Because of the nature of my app I have to add and re-index those objects
quite often (they change a lot). This seems to cause the index .fs file to
grow by a large amount... is there anything obviously wrong with the above
picture?

The main database does not have quite so much churn. Right after a pack just
now, it was 5715MB, and it gets to at most 6000MB or so after 3 hours (often
just up to 5800MB). I don't have to run the pack quite so often - is there a
significant downside to packing often?

> > Shouldn't the relevant objects - the entire set of latest
> > versions of the objects - be the ones in the cache, thus it doesn't matter
> > that the .fs file is 7.6gb as the actual used bits of it are only 400mb or
> > so?
> Every object update invalidates cached versions of the object in all
> caches except the writer's.  (Even the writer's cached value is
> invalidated if conflict-resolution was performed.)
> > Another question is, does zeopacking destroy the cache?
> No, but lots of writing does.
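So the invalidation behavior described above looks roughly like this (a plain-Python sketch of the idea, not ZEO's actual protocol or API - the class and method names are mine):

```python
# Sketch: two client caches; when client A commits a change to an object,
# the server sends invalidations so client B drops its stale copy. B will
# have to re-load the object from the server on its next access, which is
# why heavy write churn keeps emptying the caches.
class ClientCache:
    def __init__(self):
        self.cache = {}

    def invalidate(self, oid):
        self.cache.pop(oid, None)

class Server:
    def __init__(self):
        self.clients = []

    def commit(self, writer, oid, state):
        writer.cache[oid] = state           # writer keeps its copy
        for client in self.clients:
            if client is not writer:
                client.invalidate(oid)      # everyone else drops theirs

server = Server()
a, b = ClientCache(), ClientCache()
server.clients = [a, b]

b.cache["doc-1"] = "old state"
server.commit(a, "doc-1", "new state")
print("doc-1" in b.cache)   # False: b must re-fetch on next access
```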

I see. After all the above it really sounds like if I want fast indexing I
should just drop zcatalog and go ahead and use solr. It doesn't seem zcatalog
+ ZODB, the way they are now, are really made to handle many objects with
many indices that get updated often...

Thanks for all the help,
- Claudiu
For more information about ZODB, see http://zodb.org/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
