Mekaraj, Prashant wrote:
is a great resource. It's rare to see an open source project think so
much about practical enterprise deployment and this is much
There are a few more recommendations that I think would be useful to
add to the page.
Feel free to open JIRAs when you encounter problems, feature
suggestions, comments on docs, anything. If you submit patches as well
it's even better. ;-)
1. dataDir size: Since the dataDir stores snapshots and you recommend
storing at least 3 snapshots, I am thinking of using 3 times the size
of the heap allocated to the process as a guideline for how big the
dataDir drive should be.
It needs to be significantly larger than that. 3x would be a lower
bound, not an upper one. The directory is typically cleaned by a cron
script, so you aren't guaranteed that only 3 snapshots reside in it at
any one time.
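To make the cleanup step concrete, here is a minimal sketch of how such a script might pick which snapshots to delete. It assumes the usual `snapshot.<zxid in hex>` file naming; the function name `snapshots_to_purge` and the `keep` parameter are hypothetical, chosen for illustration only. It deliberately ignores transaction logs, which must be kept for as long as a retained snapshot depends on them.

```python
import re

# Assumption: snapshot files are named "snapshot.<zxid in hex>".
SNAPSHOT_RE = re.compile(r"^snapshot\.([0-9a-fA-F]+)$")


def snapshots_to_purge(filenames, keep=3):
    """Return the snapshot file names that are older than the newest `keep`.

    Only selects candidates; it does not delete anything and does not
    look at transaction log files.
    """
    if keep < 1:
        raise ValueError("keep at least one snapshot")
    snaps = []
    for name in filenames:
        match = SNAPSHOT_RE.match(name)
        if match:
            # Order by the numeric zxid, not by the file name string.
            snaps.append((int(match.group(1), 16), name))
    snaps.sort(reverse=True)  # newest (highest zxid) first
    return [name for _, name in snaps[keep:]]
```

Until the script runs, nothing is removed, which is why more than `keep` snapshots can be on disk at once. In practice the PurgeTxnLog utility bundled with ZooKeeper is probably a safer choice, since it handles the transaction logs as well.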
2. dataLogDir size: Since a new log file is started every time a
snapshot is taken, and using 3 snapshots as a recommendation, I am
thinking of using the same 3 times size of heap as a guideline.
You can end up with more than a single log per snapshot, so again this
is really a lower bound, not an upper.
We've been reluctant to pin down a number or formula because it's hard
to calculate and can depend a lot on the environment. Also, given the
size of disks these days, it hasn't been much of an issue, at least for
us, and I haven't heard much about it from others. It's a good point,
though. I don't know how one would approach the calculation; its primary
components are: 1) the frequency of writes to the ensemble, 2) heap
size, as you suggest, and 3) the frequency of "cleanup" of the dataDir.
Additional factors, such as configuration parameters changed from their
defaults, would also need to be taken into account.
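As a rough illustration of how those three components could combine, here is a back-of-the-envelope sketch. It is not an official formula. It assumes a snapshot can be as large as the heap, and that a snapshot is taken every `txns_per_snapshot` writes (playing the role of a setting such as snapCount); the function and all parameter names are hypothetical.

```python
def data_dir_lower_bound(heap_bytes, writes_per_sec, cleanup_interval_sec,
                         txns_per_snapshot, keep=3):
    """Rough lower bound, in bytes, for the disk holding snapshots.

    Assumptions (not from ZooKeeper itself): integer inputs, a steady
    write rate, and a worst-case snapshot as large as the heap.
    """
    if txns_per_snapshot < 1:
        raise ValueError("txns_per_snapshot must be positive")
    # Snapshots written between two consecutive cleanup runs.
    snaps_between_cleanups = (
        writes_per_sec * cleanup_interval_sec
    ) // txns_per_snapshot
    # Just before a cleanup, the retained snapshots and the new ones coexist.
    peak_snapshots = keep + snaps_between_cleanups
    return peak_snapshots * heap_bytes
```

For example, with a 1 GiB heap, 100 writes per second, a daily cleanup, and a snapshot every 100000 writes, this gives 89 GiB, well above the 3x guideline. Transaction logs would need a similar estimate of their own.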
3. Persistence of data and log directories:
https://issues.apache.org/jira/browse/ZOOKEEPER-546 implies that
there are cases where all zk data is loaded from a different
configuration store. In such cases, even if I use a disk that is
cleaned regularly (on reboots or rebuilds), I would be fine.
Yes, as long as you don't "rebuild" a majority of the servers at the same
Also, if a zk server were to be added to an existing ensemble (for
example, when the machine reboots) and the data and datalog
directories are empty, it seems to me that the server would sync with
the leader and build its log and snapshots again, although there will
be a performance hit on the entire ensemble while this is taking
place. Is this correct?
Minimal performance hit, really. The leader streams the latest
snap/log to the new zk server. There is not much CPU overhead and
minimal IO (a sequential read of the file), and hopefully your network
isn't maxed out. This goes on in parallel while the rest of the ensemble
continues to process requests (as long as quorum has been maintained of