Re: [ZODB-Dev] Re: [Zope3-dev] Re: Community opinion about search+filter

Jim Fulton Mon, 26 Mar 2007 05:24:51 -0800


On Mar 25, 2007, at 5:27 PM, Martijn Faassen wrote:

Hey Jim,

Jim Fulton wrote:

On Mar 25, 2007, at 12:33 PM, Martijn Faassen wrote:

[snip]

I have the strong suspicion that modern relational databases are currently better able to scale at queries using LIMIT and ORDER BY
 than the Zope 3 catalog.
I had a similar suspicion.  I assigned the Python Labs team the task
of finding out through literature search the approaches used.  They
found that there were none other than the sorts of things I've
mentioned.


What about caching strategies? (as I sketched out in my last mail)

Obviously, it depends a lot on access patterns. I expect that this is an area where picking the right strategy and succeeding is highly application specific.

Take batching. Caching would potentially make getting multiple batches go faster, but to benefit, you'd have to increase the internal batch size. For example, if the user visible batch size is 20 and you wanted them to be able to get the second batch without searching and sorting, you'd have to make your internal batch size 40. That would increase the cost for the first batch by on the order of log(2). I suspect that most people don't look at multiple batches, so caching to support multiple batches could be a significant loss, even leaving memory impact aside.
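To make the batching trade-off concrete, here is a minimal sketch, assuming the sorted batch is selected with a bounded heap (roughly n log k comparisons for the first k of n results). Under that assumption, doubling k from 20 to 40 adds log 2 to the log factor, since log 40 = log 20 + log 2. The names `top_sorted` and `CachedBatcher` are hypothetical and are not part of the Zope 3 catalog.

```python
import heapq
import random


def top_sorted(docs, key, k):
    """Return the first k docs in sort order without sorting all of them.

    heapq.nsmallest keeps at most k candidates, so the cost is roughly
    O(n log k) rather than the O(n log n) of a full sort.
    """
    return heapq.nsmallest(k, docs, key=key)


class CachedBatcher:
    """Hypothetical helper: serve user-visible batches from a larger
    internally cached batch, so nearby batches need no new search."""

    def __init__(self, docs, key, visible=20, internal=40):
        self.docs = docs
        self.key = key
        self.visible = visible    # batch size the user sees
        self.internal = internal  # batch size actually computed and cached
        self.cached = []
        self.searches = 0         # counts how often we had to search and sort

    def batch(self, n):
        end = (n + 1) * self.visible
        # Search again only if the cache is too short and more docs exist.
        if end > len(self.cached) and len(self.cached) < len(self.docs):
            # Round the requested size up to a multiple of the internal size.
            k = -(-end // self.internal) * self.internal
            self.cached = top_sorted(self.docs, self.key, k)
            self.searches += 1
        return self.cached[n * self.visible:end]


# Example data: 10,000 integers in random order, sorted by their own value.
docs = list(range(10000))
random.Random(0).shuffle(docs)
batcher = CachedBatcher(docs, key=lambda d: d)
first = batcher.batch(0)   # triggers one search for the top 40
second = batcher.batch(1)  # served from the cache, no new search
```

With these sizes the second batch costs nothing extra, but every first-batch request pays for 40 results instead of 20, which is a loss if most users never ask for the second batch.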

OTOH, we've used some highly application specific caching strategies in some of our commercial applications to great success. These caches were implemented as specialized indexes, and I would argue that indexes are really a form of caching.

This article about MySQL claims that MySQL is the only database that does query result set caching. Surprising for such an obvious thought:


Sounds like BS to me. :)

http://dev.mysql.com/tech-resources/articles/mysql-query-cache.html
Perhaps it doesn't work as well as one would think and that's why other database engines rejected it. :)



I suspect it is a hard general strategy to get right.

Note that SQL methods support query caching and Zope's caching framework is often used to cache various kinds of computations, including searches.

I cannot back this up as I haven't done measurements. Perhaps you
have done so?

We did a literature search.


That's useful, but doesn't tell us very much about how they compare in
practice.


Actually, it does.  But feel free to do some performance tests.

Perhaps someone should do measurements and see how the two compare in a
sort/batch use case. It shouldn't be too hard to set up a relational
database-based sorted batch along with a ZODB/catalog based sorted batch
and see how they both hold up.

Yup, although, to be meaningful, you need to look at large datasets. This raises the amount of effort required.
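As a rough sketch of what such a measurement harness could look like, assuming SQLite (from the Python standard library) as the relational side and a plain in-memory heap selection as a stand-in for the catalog side. It does not exercise ZODB or the Zope 3 catalog at all, and the table name, column names and helper functions are all hypothetical.

```python
import heapq
import random
import sqlite3
import time


def make_rows(n, seed=0):
    """Generate n (id, score) rows with reproducible random scores."""
    rng = random.Random(seed)
    return [(i, rng.random()) for i in range(n)]


def setup_db(rows):
    """Load the rows into an in-memory SQLite table with a sort index."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE docs (id INTEGER PRIMARY KEY, score REAL)")
    conn.executemany("INSERT INTO docs VALUES (?, ?)", rows)
    conn.execute("CREATE INDEX docs_score ON docs (score, id)")
    conn.commit()
    return conn


def sql_batch(conn, start, size):
    """Sorted batch via ORDER BY with LIMIT and OFFSET."""
    cur = conn.execute(
        "SELECT id, score FROM docs ORDER BY score, id LIMIT ? OFFSET ?",
        (size, start),
    )
    return cur.fetchall()


def heap_batch(rows, start, size):
    """Sorted batch via bounded heap selection over unsorted rows."""
    top = heapq.nsmallest(start + size, rows, key=lambda r: (r[1], r[0]))
    return top[start:start + size]


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed seconds)."""
    t0 = time.perf_counter()
    result = fn(*args)
    return result, time.perf_counter() - t0


def compare(n, start=0, size=20):
    """Time both approaches on n rows; return the two timings in seconds."""
    rows = make_rows(n)
    conn = setup_db(rows)
    sql_result, sql_time = timed(sql_batch, conn, start, size)
    heap_result, heap_time = timed(heap_batch, rows, start, size)
    assert sql_result == heap_result  # both must return the same batch
    conn.close()
    return {"sql": sql_time, "heap": heap_time}
```

Calling `compare(1_000_000)` gives a first data point, but the numbers only say something about these two stand-ins. A meaningful comparison would need the real catalog, realistic queries with filtering, and data sets too large to fit in memory.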

* Do you estimate the performance of the Zope 3 catalog to be equivalent to the performance of a modern relational database
system for queries that need to sort and batch their results?
I estimate that the same issues apply to both.
Theoretical algorithm scalability is one thing, and the same issues
apply to both. Practical scalability might vary widely.

OK, I give up. This argument just isn't worth my time any more. I'm sorry I objected to the original point.


Jim

--
Jim Fulton              mailto:[EMAIL PROTECTED]    Python Powered!
CTO                     (540) 361-1714              http://www.python.org
Zope Corporation        http://www.zope.com         http://www.zope.org



_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  [email protected]
http://mail.zope.org/mailman/listinfo/zodb-dev
