With our new ZEO client systems, we frequently observe the following problem:
a ZEO process starts to use 100% CPU time (user) without any significant
increase in requests. Sometimes (but not always) the process also stops
answering requests while still using 100% CPU.
When we kill such a process, it becomes a zombie (shown in top as 'Z' and
'<defunct>') and keeps using 100% CPU, but now as system time rather than user
time. The HTTP port is still in use, so we have to reboot the node to restart
the ZEO client. The reboot usually fails because some filesystems cannot be
unmounted: there are still locked files on them.
I tried both start methods, runzope and zopectl, but it made no difference.
All of this contradicts what I know about zombie processes: they should
use no CPU time.
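One way to see where the CPU time is going might be to look at the individual
threads rather than the process as a whole. On Linux, a process can be listed
as 'Z' once its main thread has exited while other threads of the same thread
group are still running, and those threads would then account for the CPU
time. The following is only a minimal sketch, assuming a 2.6 kernel that
provides /proc/<pid>/task and a Python interpreter on the node; the helper
names parse_stat and thread_states are made up for illustration. It prints the
state and the user/system tick counters of every thread of a given PID.

```python
import os
import sys


def parse_stat(text):
    """Parse the contents of a /proc/.../stat file.

    Returns (comm, state, utime, stime); the times are in clock ticks.
    """
    # The command name is enclosed in parentheses and may itself contain
    # spaces or parentheses, so split on the LAST closing parenthesis.
    head, _, tail = text.rpartition(")")
    comm = head.split("(", 1)[1]
    fields = tail.split()
    # After the command name, fields[0] is the state (field 3 in proc(5)).
    # utime and stime are fields 14 and 15, i.e. indexes 11 and 12 here.
    return comm, fields[0], int(fields[11]), int(fields[12])


def thread_states(pid):
    """Yield (tid, comm, state, utime, stime) for every thread of pid."""
    task_dir = "/proc/%d/task" % pid
    for tid in sorted(os.listdir(task_dir), key=int):
        try:
            with open(os.path.join(task_dir, tid, "stat")) as f:
                yield (int(tid),) + parse_stat(f.read())
        except (IOError, OSError):
            # The thread went away between listing and reading; skip it.
            continue


if __name__ == "__main__" and len(sys.argv) > 1:
    # Usage (hypothetical file name): python threadstat.py <pid>
    for row in thread_states(int(sys.argv[1])):
        print("tid=%d comm=%s state=%s utime=%d stime=%d" % row)
```

Running it twice a few seconds apart against the defunct PID should show
which thread IDs, if any, still have a growing stime value.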
System: Red Hat RHEL4, kernel 2.6.9-42.0.10.ELsmp, with address extension
(16 GB RAM).
The older cluster nodes work perfectly and have never shown this zombie
problem (they are connected to the same storage server); they run Debian
Sarge, kernel 2.4.27 SMP.
Any hint is appreciated.