[ZODB-Dev] ZEO and time.sleep

Benji York Wed, 28 Mar 2007 14:28:10 -0800

Last week I spent a very enjoyable day (no kidding) debugging a very, very slow cold-start situation (more than 15 minutes to return from the first request). When making the first request to the app (Zope 3 based), the app server and storage server would show virtually no CPU utilization, and there would be about a megabit of network traffic (on a gigabit link). There was no obvious bottleneck.

After liberal application of strace, tcpdump, wireshark (aka ethereal), and the Python profiler, we discovered that while waiting for an outstanding request for an object to load, ZEO calls a threading.Condition instance's wait method with a timeout. When given a timeout, that method enters a wait loop that calls time.sleep to sleep for a while and then checks whether the condition has been met.
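To make the wait loop concrete, here is a minimal sketch modeled on the polling loop that Python 2's threading.Condition.wait used when given a timeout. It is not ZEO code; the function name polling_wait and its injectable sleep parameter are made up for illustration. The constants (a first sleep of 1 ms, doubling up to a 50 ms cap) follow the Python 2 standard library as I remember it, so treat them as an assumption.

```python
import threading
import time


def polling_wait(waiter, timeout, sleep=time.sleep):
    """Poll the lock `waiter` until it is acquired or `timeout` elapses.

    Returns True if the lock was acquired, False on timeout.
    `sleep` is injectable so the loop can be observed in tests.
    """
    endtime = time.monotonic() + timeout
    delay = 0.0005  # doubled before first use, so the first sleep asks for 1 ms
    while True:
        if waiter.acquire(False):  # non-blocking attempt
            return True
        remaining = endtime - time.monotonic()
        if remaining <= 0:
            return False
        # Back off: 1 ms, 2 ms, 4 ms, ... capped at 50 ms and at the time left.
        delay = min(delay * 2, remaining, 0.05)
        sleep(delay)


if __name__ == "__main__":
    waiter = threading.Lock()
    waiter.acquire()  # held until the "result" arrives
    # Release almost immediately from another thread, like a fast reply.
    threading.Timer(0.0001, waiter.release).start()
    start = time.perf_counter()
    polling_wait(waiter, timeout=1.0)
    print("waited %.3f ms" % ((time.perf_counter() - start) * 1000))
```

Even when the result arrives well within the first millisecond, the waiter does not notice until its first sleep returns, so the cost of each wait is at least one time.sleep call, however long that really takes on the machine.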

We found that time.sleep on that box had a minimum granularity of 10ms (when passed a non-zero value), thus causing each object load to take approximately that long. As you can imagine, that somewhat slowed down the retrieval of the several thousand objects required to satisfy the initial request(s) (until the ZEO cache was sufficiently warm).
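For anyone who wants to check their own box, a rough way to see the effective granularity is to request a short sleep and time how long it really takes. This is only a sketch: the helper name measure_sleep and its defaults are made up for illustration, and it uses time.perf_counter from current Python rather than anything available in 2007.

```python
import time


def measure_sleep(requested=0.001, samples=20):
    """Return (shortest, average) observed duration of time.sleep(requested)."""
    durations = []
    for _ in range(samples):
        start = time.perf_counter()
        time.sleep(requested)
        durations.append(time.perf_counter() - start)
    return min(durations), sum(durations) / len(durations)


if __name__ == "__main__":
    shortest, average = measure_sleep()
    print("requested 1.000 ms, shortest %.3f ms, average %.3f ms"
          % (shortest * 1000, average * 1000))
```

If the shortest observed sleep is far above the requested 1 ms (around 10 ms, say), the machine is likely showing the same coarse timer behavior described above.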

The fix? Short-term: bump the operating system's timer interrupt on that box to 1000Hz from 100Hz, improving time.sleep's granularity to 1ms (this was on Linux; Windows' time.sleep appears to have a much higher resolution).

Long-term: Jim has found that the timeout call in the wait-for-result code can be avoided, side-stepping the call to time.sleep altogether.

--
Benji York
Senior Software Engineer
Zope Corporation
_______________________________________________
For more information about ZODB, see the ZODB Wiki:
http://www.zope.org/Wikis/ZODB/

ZODB-Dev mailing list  -  ZODB-Dev@zope.org
http://mail.zope.org/mailman/listinfo/zodb-dev
