Thanks. I see Patrick has replied in the archives, but I don't have it in my
mail (yet). I'd probably use two EC2 high-memory instances (17 GB/instance),
and I have no watches at all, so I should be able to store between 5 and 10
million znodes; I'll test that over the summer. I'll post the results here
(and will publish my simple sync, no-watch Scala client as well).
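
Back-of-the-envelope, the sum looks roughly like this in Scala, for
concreteness (the ~100-byte per-znode overhead and the 3/4 GC headroom are
Ben's figures from below; the 17 GB heap and 1 KB payload are my assumptions,
untested):

  // rough capacity estimate for one 17 GB high-memory instance
  val heapBytes   = 17L * 1024 * 1024 * 1024   // memory per instance
  val usableBytes = heapBytes * 3 / 4          // leave 1/4 headroom for the GC
  val perZnode    = 100L + 1024L               // ~100 B overhead + up to 1 KB data
  val maxZnodes   = usableBytes / perZnode     // ~12 million znodes

So 5-10M should leave a comfortable margin.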

Best, Maarten

On 15 Jul 2010, at 17:57, Benjamin Reed wrote:

> i think there is a wiki page on this, but here's the short answer:
> 
> the number of znodes impacts two things: memory footprint and recovery time. 
> each znode has a base overhead to store its path, pointers to the data, 
> pointers to the acl, etc.; i believe that is around 100 bytes. you can't just 
> divide your memory by 100 + 1K (for the data) though, because the GC needs to 
> be able to run, collect things, and maintain free space. if you use 3/4 of 
> your available memory, that would mean with 4G you can store about three 
> million znodes. when there is a crash and you recover, servers may need to 
> read this data back off the disk or over the network. that means it will take 
> about a minute to read 3G from the disk, and perhaps a bit more to read it 
> over the network, so you will need to adjust your initLimit accordingly.
> 
> of course this is all back-of-the-envelope. i would suggest running some 
> quick benchmarks to make sure your results are in line with expectations.
> 
> ben
> 
> On 07/15/2010 02:56 AM, Maarten Koopmans wrote:
>> Hi,
>> 
>> I am mapping a filesystem to ZooKeeper, using it for locking and for mapping 
>> a filesystem namespace to a flat data object space (like S3). Assuming 
>> proper nesting and small ZooKeeper nodes (< 1 KB), how many nodes could a 
>> cluster with a few GB of memory per instance realistically hold in total?
>> 
>> Thanks, Maarten
> 
> 
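
PS: the sync, no-watch client I mentioned is little more than a thin blocking
wrapper around the stock org.apache.zookeeper Java API. A minimal sketch of
the idea (class and method names are placeholders, not the published code;
every call passes watch = false, so the servers never have to track watches
for the session):

  import org.apache.zookeeper.{CreateMode, ZooDefs, ZooKeeper}

  // blocking, watch-free wrapper over the standard Java client
  class SyncClient(hosts: String) {
    private val zk = new ZooKeeper(hosts, 30000, null)  // no default watcher

    def put(path: String, data: Array[Byte]): String =
      zk.create(path, data, ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.PERSISTENT)

    def get(path: String): Array[Byte] =
      zk.getData(path, false, null)  // watch = false, no Stat needed

    def close(): Unit = zk.close()
  }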
