need ops documentation that details supervision of ZK server processes

                 Key: ZOOKEEPER-485
             Project: Zookeeper
          Issue Type: Bug
          Components: documentation, server
            Reporter: Patrick Hunt
             Fix For: 3.2.1, 3.3.0

We need ops documentation detailing what to do if the ZK server VM fails (by 
"fails" I mean the JVM process itself dies).

In general, a supervisor process should be used to start, stop, restart, and 
otherwise manage the ZK server JVM.

Something like daemontools could be used or, more simply, a wrapper script 
that monitors the status of the pid and restarts the JVM if it fails. The 
choice is up to the operator. If this is not done automatically, it will have 
to be done manually, with the operator restarting the ZK server JVM.
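
As a rough illustration of the wrapper-script idea, here is a minimal sketch in 
Python. It is not part of ZooKeeper; the function name supervise, the restart 
limit, and the delay are all assumptions made for this example. It runs a 
command in the foreground, waits for it to exit, and starts it again unless it 
exited cleanly or the restart limit was reached.

```python
import subprocess
import time


def supervise(cmd, max_restarts=5, delay_seconds=1.0, sleep=time.sleep):
    """Run cmd in the foreground and restart it when it exits abnormally.

    Hypothetical helper for illustration only. Returns the list of exit
    codes observed, one per run of the child process.
    """
    exit_codes = []
    while True:
        # Start the child and block until it exits.
        child = subprocess.Popen(cmd)
        code = child.wait()
        exit_codes.append(code)

        # Assumption: exit code 0 means a deliberate shutdown, so do not restart.
        if code == 0:
            break
        # Give up once the initial run plus max_restarts restarts have all failed.
        if len(exit_codes) > max_restarts:
            break
        # Pause briefly so a crash loop does not spin the CPU.
        sleep(delay_seconds)
    return exit_codes
```

The command passed in has to keep the server JVM in the foreground so that the 
wrapper can wait on it, for example a java invocation of 
org.apache.zookeeper.server.quorum.QuorumPeerMain with the path to the config 
file (the classpath and file locations depend on the installation). A real 
deployment would also want logging and signal handling, which this sketch 
leaves out.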

The inherent behavior of ZK with respect to failures (i.e., it recovers 
automatically as long as quorum is maintained) fits into this nicely.
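
To make "as long as quorum is maintained" concrete: quorum is a strict majority 
of the ensemble, so the number of servers that can be down at once follows from 
the ensemble size. The small sketch below works through that arithmetic; the 
function names are made up for this example.

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be down while a majority is still up."""
    return ensemble_size - quorum_size(ensemble_size)


for n in (3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

So a 3-server ensemble keeps serving with one JVM down, which gives the 
supervisor (or the operator) time to restart it.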
