I disagree with the original post that this is a problem, even in EC2.
Having the persistent copy on disk is exactly what makes the rolling restart
work so well.

I think the misunderstanding is the belief that any one member's on-disk
image is critical to cluster function.  It is not, because the state is
replicated to all cluster members.  This means that any single member can
disappear and a new instance can replace it at no great cost beyond the
temporary load of copying the current snapshot from another cluster member.
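
To make the replacement argument concrete, here is a toy Python sketch.  It
is only a model of the idea, not how ZooKeeper is implemented: the Member
class, the replace_member function and the zk1..zk4 names are all made up
for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    """One hypothetical cluster member holding a full copy of the state."""
    name: str
    snapshot: dict = field(default_factory=dict)


def replace_member(cluster, lost_name, new_name):
    """Drop a lost member and bootstrap a replacement from any survivor."""
    survivors = [m for m in cluster if m.name != lost_name]
    if not survivors:
        raise RuntimeError("no surviving member to copy the snapshot from")
    # The only cost of the replacement: one copy of the current snapshot.
    newcomer = Member(new_name, dict(survivors[0].snapshot))
    return survivors + [newcomer]


# Three members, each holding the same replicated state (assumed example data).
state = {"/config/timeout": "30"}
cluster = [Member(f"zk{i}", dict(state)) for i in (1, 2, 3)]

# zk2 disappears (say, the EC2 instance is terminated); zk4 takes its place.
cluster = replace_member(cluster, "zk2", "zk4")
```

After the call the cluster again has three members with identical state,
and nothing from the lost member's disk was needed to get there.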

> > In the documentation of zookeeper, I have read that
> > zookeeper saves snapshots of the in-memory data in the file system. Is
> > that needed for recovery? Logically, it would be much easier for me if
> > this is not the case.
> Yes, zookeeper keeps persistent state on disk. This is used for recovery
> and correctness of zookeeper.

