On Wed, Oct 13, 2010 at 10:28 AM, Chris Withers <ch...@simplistix.co.uk> wrote:
> On 13/10/2010 15:23, Jim Fulton wrote:
>>> Their occurrences coincided with an app server cluster of 4 clients
>>> up completely...
>> Did the numbers get much bigger than 3?
>>> I would dearly love to know why that happened.
>> Typically, it's due to a client that votes and fails to call finish.
> Wouldn't I see storage errors in that case?
No. Unless you set a transaction timeout, the storage will
wait for the finish indefinitely.
> The only errors I saw were these and quite a few:
> line 377, in stub
> raise ValueError("Timeout waiting for protocol handshake")
> ValueError: Timeout waiting for protocol handshake
These were on the client, not the server.
This suggests that there's something else going on with your storage
server and that the "transactions waiting" messages are a red herring.
>>> Any way to interrogate a
>>> running zeo server to find out what it thinks it's up to?
>> You can connect to the monitor port in 3.9 and earlier,
> When that's configured, what information does it provide and how do I get
> it? (if there are docs, lemme know and I'll go read them instead)
Go read the code or try it.
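
For anyone who wants to try it, here is a minimal sketch of reading the
monitor port from Python. It assumes the server was started with a
monitor-address set in its <zeo> section (3.9 and earlier), and that the
monitor writes a plain-text report on connect and then closes the
connection. The function name, host, and port are made up for
illustration; check the monitor code for the exact output.

```python
import socket


def read_zeo_monitor(host, port, timeout=10.0):
    """Return everything the monitor port sends before it closes, as text.

    Assumption: the monitor dumps a plain-text report on connect and then
    closes the connection, so we just read until EOF.
    """
    chunks = []
    with socket.create_connection((host, port), timeout=timeout) as sock:
        while True:
            data = sock.recv(4096)
            if not data:  # empty read means the server closed the socket
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


# Example use (hypothetical host and port):
#     print(read_zeo_monitor("localhost", 8101))
```

Something like `nc localhost 8101` from a shell should show the same
output without any code.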
>> You almost certainly want to set transaction-timeout in
>> your server configuration. This will cause transactions that take too
>> long to be terminated. We use a transaction timeout of 300 seconds.
> 5 mins is pretty high, though, right? Surely if all clients end up hanging
> for 230 seconds, they'll all be dropped by a load balancer?
That depends entirely on your application.
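
For reference, the transaction-timeout setting mentioned above goes in
the <zeo> section of the server configuration. This is only a sketch: the
address and storage path are placeholders, and 300 is the value we use,
not a recommendation for every application.

```text
<zeo>
  # placeholder port; use whatever your clients connect to
  address 8100
  # seconds a transaction may wait after vote before it is terminated
  transaction-timeout 300
</zeo>

<filestorage 1>
  # placeholder path
  path /path/to/Data.fs
</filestorage>
```

Pick a value shorter than whatever your load balancer or front end
tolerates if hung clients are the main concern.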
> Related: how can I find out how long transactions are taking?
Note that we're really talking about how long commits are taking,
specifically the time between vote and finish. You can determine that
from the waiting messages.
ZODB-Dev mailing list - ZODB-Dev@zope.org