Sorry for the length of this one... but I'm trying to braindump to give you as much info about the problem as possible.
To be sure it doesn't get lost in my ramblings below, there is a
probably important piece of information I haven't mentioned yet: these
errors seem to coincide with the session data timeout setting [1]. I
don't get the errors at all until the timeout is reached or has passed.

[1] The timeout setting I'm referring to is denoted by the label
    "Data object timeout value in minutes" on the
    /temp_folder/session_data object.

Chris McDonough wrote:
> OK, thanks John. Let's try one more thing... currently the mounted
> database used to store the session data uses a connection that ignores
> read conflicts. This is known to be bad because the machinery which
> deals with keeping the sessioning index data will also ignore read
> conflicts, which may create inconsistencies between two data structures
> (BTrees) that need to be kept in sync.

I tried this and it seemed to help some. I haven't seen the get() error
we've been discussing yet, but the load() error just occurred (line 94
in TemporaryStorage - this was error #1 in my original email). The
traceback is a bit different from the one in my original email, though,
as the LowConflictConnection isn't being used.
Here's the new traceback:

Error Type: KeyError
Error Value: [non-ascii chars]

Traceback (innermost last):

  * Module ZPublisher.Publish, line 98, in publish
  * Module ZPublisher.mapply, line 88, in mapply
  * Module ZPublisher.Publish, line 39, in call_object
  * Module Products.DotOrg.Pages.KPage, line 110, in testSession
  * Module Products.DotOrg.Utils.Spawn, line 42, in launchProcess
  * Module Products.DotOrg.Utils.Spawn, line 73, in storeArgs
  * Module Products.Sessions.SessionDataManager, line 180, in _getSessionDataObject
  * Module Products.Transience.Transience, line 175, in new_or_existing
  * Module Products.Transience.Transience, line 797, in get
  * Module Products.Transience.Transience, line 546, in _getCurrentBucket
  * Module ZODB.Connection, line 509, in setstate
  * Module Products.TemporaryFolder.TemporaryStorage, line 94, in load

> Here's a patch to lib/python/Products/TemporaryFolder/TemporaryFolder.py
> that reenables read conflict generation on the database.
>
> Index: TemporaryFolder.py
> ===================================================================
> RCS file:
> /cvs-repository/Zope/lib/python/Products/TemporaryFolder/TemporaryFolder.py,v
> retrieving revision 1.7
> diff -r1.7 TemporaryFolder.py
> 72c72
> < db.klass = LowConflictConnection
> ---
> > #db.klass = LowConflictConnection
>
> You may see many more conflicts with this running. But maybe the data
> structures will not become desynchronized.

You weren't kidding about the increase in conflict errors.

> Another problem, still unexplained, experienced by Andrew Athan, is that
> if a reference is made to a session data object from within the standard
> error message, somehow things get screwy under high load. If you're
> doing the same, please let me know.

Before this started happening, a hasSessionData check was being called
during standard error publishing. We removed it early this week when
the errors began.
---

It might help you better understand what is causing the problem if you
know where we're using sessions and how we can force the problem to
occur. Not sure if this will be of much help, but I thought it couldn't
hurt.

We use sessions primarily as a sort of authenticated-user marker. The
session just stores the username and a state field, which are used in
non-authenticated sections of our site to detect that the user has
logged into the site (we can then raise an unauthorized error to get the
basic auth info for that user). These calls happen in the __call__()
method of our basic Content class (subclassed from DTMLMethod). We use
sessions in a couple of other places for small things, but this one sees
the most use.

I've figured out how to force these errors to happen to some extent.
I've written a method that starts up a thread, which uses Client.call to
call another method, which then basically just loops endlessly calling
hasSessionData and getSessionData, incrementing a number in the session
data and sleeping for N seconds between loops.

One of these threads will run forever without a problem. Once you start
a second thread, ReadConflictErrors start getting raised. Which thread
gets the conflict and which one keeps working seems to vary (probably
just a timing thing). If I start enough of these threads I can cause the
error to happen, but only once the session timeout is reached.

Note that to speed up getting the errors I either set the session
timeout to 1 minute via a _setTimeout() call or manually tweak the
appropriate session data manager's attributes (_timeout_secs, _period
and _timeout_slices) to very small values (i.e. a few seconds).

--
John Eikenberry
[EMAIL PROTECTED]
______________________________________________________________
"A society that will trade a little liberty for a little order
 will deserve neither and lose both."
                                          --B. Franklin
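
For reference, the shape of the reproduction loop described above can be
sketched roughly as follows. This is not the actual test code and it
does not touch Zope at all: FakeSessionStore, worker and run are
hypothetical names, a plain dict stands in for the session data object,
and a lock serializes access. That means it only illustrates the load
pattern (several threads, each checking for session data, fetching it,
bumping a counter and sleeping, against a short timeout). It will not
raise a ReadConflictError or the KeyError from the traceback.

```python
import threading
import time


class FakeSessionStore:
    """Hypothetical stand-in for the session data container (not a Zope API)."""

    def __init__(self, timeout_secs):
        self.timeout_secs = timeout_secs
        self.lock = threading.Lock()   # callers hold this around each access
        self.expirations = 0           # how many times live data timed out
        self._data = None
        self._last_access = 0.0

    def has_session_data(self):
        """True if session data exists and has not timed out."""
        if self._data is None:
            return False
        return time.monotonic() - self._last_access < self.timeout_secs

    def get_session_data(self):
        """Return the session dict, recreating it if missing or timed out."""
        if not self.has_session_data():
            if self._data is not None:
                self.expirations += 1
            self._data = {}
        self._last_access = time.monotonic()
        return self._data


def worker(store, loops, sleep_secs):
    """One test thread: check, fetch, increment, sleep."""
    for _ in range(loops):
        with store.lock:
            store.has_session_data()
            data = store.get_session_data()
            data["count"] = data.get("count", 0) + 1
        time.sleep(sleep_secs)


def run(n_threads, loops, sleep_secs, timeout_secs):
    """Start n_threads workers against one store and wait for them all."""
    store = FakeSessionStore(timeout_secs)
    threads = [
        threading.Thread(target=worker, args=(store, loops, sleep_secs))
        for _ in range(n_threads)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return store


if __name__ == "__main__":
    # Short timeout relative to the run, mirroring the "tweak the timeout
    # to a few seconds" trick; the numbers here are arbitrary.
    store = run(n_threads=4, loops=20, sleep_secs=0.05, timeout_secs=0.5)
    print(store.get_session_data(), "expirations:", store.expirations)
```

In a real test the body of worker would call the session data manager
through Client.call instead of the fake store, and the conflicts would
come from ZODB rather than being hidden behind a lock.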