Python2.4.2 Zope 2.8.4 ZODB/Zeo 2.3.4 Mysql 4.0 Dual Athalon processors Linux RH7.3
We have two recent instances in our production sites where Zope suddenly stops responding. It is not a new problem, but we've now been confronted with two clean examples and nothing to blame them on. The problem appears to be independent of load as both incidents were on lightly loaded machines. A check of the logs (Linux and Zope) shows nothing obviously amiss except that the trace log (the old -M log) shows a sudden increase in active requests from the typical 0 or 1 to 1300 or more. In this context an "active request' is total number of requests pending at the end of this request and is computed by post-processing. We front-end Zope with pound and make heavy use of MySQL. Both show a plethora of incomplete transactions. Examination of the raw trace log shows that Zope is continuing to accept requests, but nothing getting done. The raw log date-stamps four internal states for each transaction. The states are Begin (B), Input (I), action (A), and End (E). Inputs are gathered between B and I, outputs is made between A and E. The raw log shows B and I transactions, but apparently no processing is completing. I suspect that nothing is getting scheduled. I am at a loss as to where to begin to track this one down. The failure is spontaneous and apparently not triggered by any readily distinguishable inputs or pattern of inputs. The behavior smells a bit of resource limits or process synchronization problems, but there is not real evidence for either being the root cause. I am not sure what monitoring I should be doing to help locate the source of the problem. Has anyone seen seen a similar problem? Any advice as to how to proceed? -- _______________________________________________ Zope maillist - Zope@zope.org http://mail.zope.org/mailman/listinfo/zope ** No cross posts or HTML encoding! ** (Related lists - http://mail.zope.org/mailman/listinfo/zope-announce http://mail.zope.org/mailman/listinfo/zope-dev )