Due to issues in my fingers and brain.

On Wed, Sep 8, 2010 at 1:20 PM, Vishal K <vishalm...@gmail.com> wrote:

> Thanks Ted.  Did you have to unwind the cluster due to data consistency
> issues or due to issues at the application?
>
> On Wed, Sep 8, 2010 at 4:06 PM, Ted Dunning <ted.dunn...@gmail.com> wrote:
>
> > I have used old snapshot files exactly once when I deleted a bunch of
> > server state trying to unwind a tangled cluster.
> >
> > I keep a few around just for backup purposes.
> >
> > On Wed, Sep 8, 2010 at 12:01 PM, Vishal K <vishalm...@gmail.com> wrote:
> >
> > > Hi All,
> > >
> > > Can you please share your experience regarding ZK snapshot retention
> > > and recovery policies?
> > >
> > > We have an application where we never need to roll back (i.e., revert
> > > to a previous state by using old snapshots). Given this, I am trying
> > > to understand under what circumstances we would ever need to use old
> > > ZK snapshots. I understand a lot of these decisions depend on the
> > > application and the amount of redundancy used at every level in the
> > > product (e.g., the RAID level where the snapshots are stored). To
> > > simplify the discussion, I would like to rule out any application
> > > characteristics and focus mainly on data consistency.
> > >
> > > - Assuming that we have a 3-node cluster, I am trying to figure out
> > > when I would really need to use old snapshot files. With 3 nodes we
> > > already have at least 2 servers with a consistent database. If I lose
> > > files on one of the servers, I can use files from the other. In fact,
> > > ZK server join will take care of this. I can remove files from a
> > > faulty node and reboot that node. The faulty node will sync with the
> > > leader.
> > >
> > > - The old files will be useful if the current snapshot and/or log
> > > files are lost or corrupted on all 3 servers. If the loss is due to a
> > > disaster (a case where we lose all 3 servers), one would have to keep
> > > the snapshots on some external storage to recover. However, if the
> > > current snapshot file is corrupted on all 3 servers, then the most
> > > likely cause would be a bug in ZK. In which case, how can I trust the
> > > consistency of the old snapshots?
> > >
> > > - Given a set of snapshots and log files, how can I verify the
> > > correctness of these files? For example, what if one of the
> > > intermediate snapshot files is corrupt?
> > >
> > > - The Admin's guide says "Using older log and snapshot files, you can
> > > look at the previous state of ZooKeeper servers and even restore that
> > > state. The LogFormatter class allows an administrator to look at the
> > > transactions in a log." Is there a tool that does this for the admin?
> > > The LogFormatter only displays the transactions in the log file.
> > >
> > > - Has anyone ever had to play with the snapshot files in production?
> > >
> > > Thanks in advance.
> > >
> > > Regards,
> > > -Vishal
> > >
> >
>
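
For anyone who wants to turn "keep a few around" into a concrete rule, here
is a minimal sketch of a retention planner. It is not ZooKeeper's own purge
logic (ZooKeeper ships a separate utility,
org.apache.zookeeper.server.PurgeTxnLog, for that). It assumes the data
directory uses the usual file names snapshot.<zxid> and log.<zxid> with the
zxid written in hex. The function name plan_retention and the default of 3
snapshots are hypothetical choices for illustration. It only decides what
could be deleted and never touches the disk.

```python
import re

# Assumed naming convention: "snapshot.<hex zxid>" and "log.<hex zxid>".
_NAME = re.compile(r"^(snapshot|log)\.([0-9a-fA-F]+)$")


def plan_retention(filenames, keep=3):
    """Return (retain, purge) as sorted lists of file names.

    Keeps the `keep` newest snapshots, every log that starts after the
    oldest kept snapshot, and (conservatively) the newest log that starts
    at or before it. Files with unrecognised names are always retained.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")

    snaps, logs, retain = [], [], set()
    for name in filenames:
        match = _NAME.match(name)
        if match is None:
            retain.add(name)  # unknown file: never propose deleting it
            continue
        zxid = int(match.group(2), 16)
        (snaps if match.group(1) == "snapshot" else logs).append((zxid, name))

    snaps.sort()
    logs.sort()

    kept_snaps = snaps[-keep:]
    retain.update(name for _, name in kept_snaps)

    if kept_snaps:
        oldest = kept_snaps[0][0]
        older = [entry for entry in logs if entry[0] <= oldest]
        newer = [entry for entry in logs if entry[0] > oldest]
        retain.update(name for _, name in newer)
        if older:
            # This log may still hold transactions near the snapshot zxid.
            retain.add(older[-1][1])
    else:
        # No snapshots at all: the logs are the only copy of the state.
        retain.update(name for _, name in logs)

    purge = [name for name in filenames if name not in retain]
    return sorted(retain), sorted(purge)
```

Calling plan_retention(os.listdir(path), keep=3) on a copy of the data
directory gives a list to review before anything is removed. Keeping one
extra log before the oldest retained snapshot is deliberately conservative:
it costs some disk space but avoids dropping a log that a restore from that
snapshot might still need.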
