[Martin Gfeller, attacks a ZEO hang on Windows]
Man, that sucks! Looks like a Python bug to me, probably specific to
Windows, and (as you discovered) introduced by a change in Python 2.4
that was deliberately not backported to 2.3. I opened a Python tracker
item with a minimal hanging test case:
> My code uses no threads, apart from those created by ZEO itself.
A thread is needed to provoke an import hang, but ZEO creates quite
enough threads all by itself. "This kind of thing" generally requires
that a thread get spawned as a _side effect_ of doing an import(*),
but offhand I don't recall any code in ZODB/ZEO that does that. If
that rings a bell for anyone else, that would be an approach to
worming around the bug in the ZODB/ZEO code.
It's curious that nobody else reported this before, and ZODB's test
suite certainly doesn't provoke it (that's been run under Python 2.4
since before 2.4a1).
(*) Python's internal import lock is reentrant wrt the thread currently
doing an import, but blocks other threads from doing an import.
So the usual fatal dance starts like so:
  1. thread A does an import of module M
  2. code in module M spawns another thread B, and starts B running
  3. code in thread B tries to do an import, but is blocked because
     thread A's import (of M) is still in progress
  4. thread A continues the process of importing M, and M does
     something that waits for thread B to accomplish something
There the whole shebang deadlocks: A is waiting for B to do something,
and the import of M can't complete until A makes progress. But B
is waiting for the import of M to complete.
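The dance described above can be replayed without touching the real import
machinery. The sketch below is a model, not the interpreter's actual code:
it assumes a single process-wide import lock and stands in a
threading.RLock for it, and it uses timeouts so that it reports the
deadlock instead of hanging. The names simulate_import_deadlock and
import_lock are made up for illustration.

```python
import threading

# Stand-in for the import lock as described above: reentrant for the
# thread that holds it, blocking for every other thread.
import_lock = threading.RLock()


def simulate_import_deadlock(timeout=0.5):
    """Replay the A/B dance; timeouts turn the hang into a report."""
    b_done = threading.Event()
    result = {}

    def thread_b():
        # B tries to do an import, i.e. to take the import lock.
        got = import_lock.acquire(timeout=timeout)
        result["b_got_lock"] = got
        if got:
            import_lock.release()
            b_done.set()

    # Thread A starts importing M and takes the import lock.
    with import_lock:
        # Reentrancy: a nested import in the same thread is fine.
        with import_lock:
            pass
        # Code in M spawns thread B and starts it running.
        b = threading.Thread(target=thread_b)
        b.start()
        # M waits for B while the import is still in progress. A waits
        # longer than B so B always gives up first.
        result["a_saw_b_finish"] = b_done.wait(timeout * 2)
    b.join()
    return result


if __name__ == "__main__":
    print(simulate_import_deadlock())
```

Both entries come back False: B never gets the lock while A holds it, and A
never sees B finish. Without the timeouts, neither thread would ever move
again.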
Hmm. I suppose there's some way for asyncore running in its own
thread to fatally confuse the import lock too, but I haven't bumped
into that myself.