On our fairly busy site (http://cnx.org) we're seeing in the logs some
instances of an error in Shared.DC.ZRDB.DA::

    Module None, line 97, in search_form_handler
     - <FSPythonScript at /plone/search_form_handler used for
     - Line 97
    Module Products.RhaptosRepository.Repository, line 537, in
    Module Products.RhaptosRepository.VersionFolder, line 456, in search
    Module Products.RhaptosModuleStorage.ZSQLFile, line 44, in __call__
    Module Shared.DC.ZRDB.DA, line 492, in __call__
     - <ExtZSQLMethod at /plone/portal_moduledb/20071212233625.240206723892>
    Module Shared.DC.ZRDB.DA, line 393, in _cached_result
    KeyError: ("\nSELECT p.*\nFROM persons p\nwhere\n p.firstname ~*
    req('href='::text)\n or\n p.surname ~* req('href='::text)\n or\n
    p.fullname ~* req('href='::text)\n or \n p.personid ~*
    ('^'||req('href='::text)||'$')\n or\n p.email ~*
    (req('href='::text)||'.*@')\n\n", 0, 'devrep')

This is trying to remove a key from the ZSQL cache to shrink it down to
size, but doesn't find the key. From Shared.DC.ZRDB.DA._cached_result:

    # if the cache is too big, we purge entries from it
    if len(cache) >= max_cache:
        # We also hoover out any stale entries, as we're
        # already doing cache minimisation.
        # 'keys' is ordered, so we purge the oldest results
        # until the cache is small enough and there are no
        # stale entries in it
        while keys and (len(keys) >= max_cache or keys < t):
            del cache[q]  # <===== line 393, with the error

It looks a lot like:
but we have that fix in our Zope 2.9.8:
Perhaps it is another high-load leak? I don't think it can be multiple
threads doing cleanup at the same time, unless maybe there's a
transaction commit in there somewhere I don't know about.
Or maybe I'm running into the problem described in the big comment at

    # When a ZSQL method is handled by one ZPublisher thread twice in
    # less time than it takes for time.time() to return a different
    # value, the SQL generated is different, then this code will leak
    # an entry in 'cache' for each time the ZSQL method generates
    # different SQL until time.time() returns a different value.
    # On Linux, you would need an extremely fast machine under extremely
    # high load, making this extremely unlikely. On Windows, this is a
    # little more likely, but still unlikely to be a problem.
    # If it does become a problem, the values of the tcache mapping
    # need to be turned into sets of cache keys rather than a single
    # cache key.

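
To make the last sentence of that comment concrete, here is a minimal sketch of a time index whose values are sets of cache keys. It is not the Zope code: plain dicts stand in for the real mappings, and the helper names store and purge_oldest are hypothetical.

```python
import time


def store(cache, tcache, key, result, now=None):
    """Record result under key and index the key by its timestamp.

    tcache maps a timestamp to the *set* of keys stored at that time,
    so two keys stored within one clock tick do not overwrite each other.
    """
    if now is None:
        now = time.time()
    cache[key] = (now, result)
    tcache.setdefault(now, set()).add(key)


def purge_oldest(cache, tcache):
    """Drop the cache entries recorded at the oldest timestamp."""
    if not tcache:
        return
    oldest = min(tcache)
    for key in tcache.pop(oldest):
        entry = cache.get(key)
        # Only delete if the entry still belongs to this timestamp;
        # the key may have been re-stored later or removed already.
        if entry is not None and entry[0] == oldest:
            del cache[key]


cache, tcache = {}, {}
store(cache, tcache, "q1", "r1", now=100.0)
store(cache, tcache, "q2", "r2", now=100.0)  # same tick, different SQL
store(cache, tcache, "q3", "r3", now=101.0)
purge_oldest(cache, tcache)
```

With a single key per timestamp, the second store at 100.0 would have replaced the first in the index and left "q1" in the cache with nothing pointing at it.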
Would it be unsafe to toss a try/except around the del cache[q] bit on
the theory that it's already deleted, so, hey, why fail? It'd be really
nice to keep this error away from users, even if it does cause a bit of a leak.
I'll probably be setting up some logging to try and characterize this
further, but anybody have any clues?
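
A minimal sketch of the try/except idea combined with the logging, under stated assumptions: plain dicts stand in for the real cache and time index, the loop is assumed to compare the oldest timestamp against t, and the function name purge and the logger name "zsql.cache" are hypothetical. It is not a patch against Shared.DC.ZRDB.DA.

```python
import logging

logger = logging.getLogger("zsql.cache")  # hypothetical logger name


def purge(cache, tcache, max_cache, t):
    """Purge oldest entries until the cache is small enough and not stale.

    cache maps a query key to its result; tcache maps a timestamp to a
    query key. Returns the keys that the time index referenced but the
    cache no longer held, so the caller can inspect them.
    """
    missing = []
    keys = sorted(tcache)  # oldest timestamp first
    while keys and (len(keys) >= max_cache or keys[0] < t):
        stamp = keys.pop(0)
        q = tcache.pop(stamp)
        try:
            del cache[q]
        except KeyError:
            # Already gone: record it instead of failing the request.
            missing.append(q)
            logger.warning("cache key already gone: stamp=%r key=%r",
                           stamp, q)
    return missing


# Two index entries point at the same key, so the second delete misses.
cache = {"a": 1, "b": 2}
tcache = {1.0: "a", 2.0: "a", 3.0: "b"}
missing = purge(cache, tcache, max_cache=10, t=2.5)
```

Swallowing the KeyError keeps the failure away from users, and the logged timestamp and key show which entries the index still referenced after the cache had dropped them.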
"Building Websites with Plone"