I disagree with the original post that this is a problem, even in EC2.
Having the persistent copy on disk is exactly what makes the rolling restart
work so well.
I think the misunderstanding is the assumption that any one on-disk image is
critical to cluster function. It is not, because the state is replicated to
all cluster members. This means that any member can disappear and a new
instance can replace it at no great cost beyond the temporary load of
copying the current snapshot from some cluster member.
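
To make that concrete, here is a toy sketch of the idea in Python. It is not
ZooKeeper code and uses none of its API: the Member and Cluster classes, the
"disk" dict, and the zk1/zk2/zk3 names are all made up for illustration, and
it assumes every live member persists every update, which glosses over
quorums and transaction logs.

```python
import copy


class Member:
    """Toy stand-in for one server; 'disk' models its persistent snapshot."""

    def __init__(self, name, disk=None):
        self.name = name
        self.disk = disk if disk is not None else {}


class Cluster:
    """Hypothetical cluster model, only meant to show replication."""

    def __init__(self, names):
        self.members = {n: Member(n) for n in names}

    def write(self, key, value):
        # Simplification: every live member persists every update.
        for member in self.members.values():
            member.disk[key] = value

    def lose(self, name):
        # The instance and its local disk vanish together (as on EC2).
        del self.members[name]

    def replace(self, name):
        # A fresh instance starts empty and copies the snapshot from a peer.
        donor = next(iter(self.members.values()))
        self.members[name] = Member(name, copy.deepcopy(donor.disk))
        return donor.name


cluster = Cluster(["zk1", "zk2", "zk3"])
cluster.write("/config/leader", "zk1")
cluster.lose("zk2")                 # zk2 and its disk are gone
cluster.write("/config/epoch", 7)   # the survivors keep accepting updates
donor = cluster.replace("zk2")      # the newcomer catches up from a peer
assert cluster.members["zk2"].disk == cluster.members[donor].disk
```

The only cost of replacing zk2 in this model is the copy inside replace(),
which is the temporary load I mentioned above.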
On Mon, Jul 6, 2009 at 11:33 AM, Mahadev Konar <maha...@yahoo-inc.com> wrote:
> > In the documentation of zookeeper, I have read that
> > zookeeper saves snapshots of the in-memory data in the file system. Is
> > that needed for recovery? Logically, it would be much easier for me if
> > this is not the case.
> Yes, zookeeper keeps persistent state on disk. This is used for recovery
> and correctness of zookeeper.