I've done some tests with ~600 clients creating 5 million znodes (100 bytes each, IIRC) and 25 million watches. I was using 8 GB of memory for this. However, in this scenario it's critical that you tune the GC; in particular you need to turn on the CMS and incremental GC options. Otherwise, when the GC collects it will collect for long periods of time and all of your clients will then time out. Keep an eye on the max latency of your servers; that's usually the most obvious indication of GC hits (it will spike up).
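
One rough way to watch the max latency is to poll the server's `stat` four-letter command and pull out the "Latency min/avg/max" line. The sketch below assumes that command is enabled on your server and that the reply contains a line in that format; the function names, host, and port are placeholders of mine, not anything from the test setup described above.

```python
import re
import socket


def fetch_stat(host, port=2181, timeout=5.0):
    """Send the 'stat' four-letter command and return the raw reply text."""
    with socket.create_connection((host, port), timeout=timeout) as sock:
        sock.sendall(b"stat")
        chunks = []
        while True:
            data = sock.recv(4096)
            if not data:  # server closes the connection after replying
                break
            chunks.append(data)
    return b"".join(chunks).decode("utf-8", errors="replace")


def parse_max_latency(stat_text):
    """Return the max latency from a 'Latency min/avg/max: a/b/c' line.

    Returns None if no such line is present (assumed output format).
    """
    match = re.search(
        r"Latency min/avg/max:\s*(-?\d+)/(-?[\d.]+)/(-?\d+)", stat_text
    )
    if match is None:
        return None
    return int(match.group(3))


if __name__ == "__main__":
    # Hypothetical host; replace with one of your servers.
    print(parse_max_latency(fetch_stat("localhost")))
```

Run it periodically against each server; a sudden jump in the returned value is the kind of spike to look for.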

You can use the latency tester from here to do the quick benchmarks Ben suggested:
also see: http://bit.ly/4ekN8G


On 07/15/2010 08:57 AM, Benjamin Reed wrote:
i think there is a wiki page on this, but for the short answer:

the number of znodes impacts two things: memory footprint and recovery
time. there is a base overhead for each znode to store its path, pointers to
the data, pointers to the acl, etc. i believe that is around 100 bytes.
you can't just divide your memory by 100+1K (for data) though, because
the GC needs to be able to run and collect things and maintain some free
space. if you use 3/4 of your available memory, that would mean with 4G
you can store about three million znodes. when there is a crash and you
recover, servers may need to read this data back off the disk or over
the network. that means it will take about a minute to read 3G from the
disk and perhaps a bit more to read it over the network, so you will
need to adjust your initLimit accordingly.
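
The arithmetic above can be sketched as a small script. It is only a back-of-the-envelope sketch: the 100-byte overhead, 1K of data, and 3/4 usable fraction come from the estimate above, while the 50 MB/s read rate (roughly 3G in a minute) and the 2000 ms tickTime are my own assumptions, as are the function names. initLimit is counted in ticks, so the read time is converted to ticks at the end.

```python
import math


def estimate_znode_capacity(heap_bytes, data_bytes=1024,
                            overhead_bytes=100, usable_fraction=0.75):
    """Rough count of znodes that fit, leaving headroom for the GC."""
    usable = heap_bytes * usable_fraction
    return int(usable // (overhead_bytes + data_bytes))


def min_init_limit(snapshot_bytes, read_bytes_per_sec, tick_time_ms=2000):
    """Smallest initLimit (in ticks) that covers reading the snapshot."""
    seconds = snapshot_bytes / read_bytes_per_sec
    return math.ceil(seconds * 1000 / tick_time_ms)


if __name__ == "__main__":
    GIB = 1024 ** 3
    MIB = 1024 ** 2
    # 4G heap with ~1K of data per znode: a little under three million.
    print(estimate_znode_capacity(4 * GIB))
    # 3G at an assumed 50 MB/s with an assumed tickTime of 2000 ms.
    print(min_init_limit(3 * GIB, 50 * MIB))
```

Treat the results as a starting point and leave extra margin, since sync over the network may be slower than a local disk read.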

of course this is all back-of-the-envelope. i would suggest doing some
quick benchmarks to test and make sure your results are in line with


On 07/15/2010 02:56 AM, Maarten Koopmans wrote:

I am mapping a filesystem to ZooKeeper, and use it for locking and
mapping a filesystem namespace to a flat data object space (like S3).
So assuming proper nesting and small ZooKeeper nodes (< 1KB), how many
nodes could a cluster with a few GBs of memory per instance
realistically hold totally?

Thanks, Maarten
