[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-485?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Patrick Hunt updated ZOOKEEPER-485:
-----------------------------------

    Attachment: ZOOKEEPER-485.patch

This patch documents running the ZK server under a supervisory process (it 
also fills out the monitoring section).

> need ops documentation that details supervision of ZK server processes
> ----------------------------------------------------------------------
>
>                 Key: ZOOKEEPER-485
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-485
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: documentation, server
>            Reporter: Patrick Hunt
>             Fix For: 3.3.0
>
>         Attachments: ZOOKEEPER-485.patch
>
>
> We need ops documentation detailing what to do if the ZK server VM fails. By 
> "fails" I mean that the JVM process exits, dies, crashes, etc.
> In general, a supervisor process should be used to start, stop, and restart 
> the ZK server JVM.
> Something like daemontools (http://cr.yp.to/daemontools.html) could be used, 
> or, more simply, a wrapper script could monitor the status of the pid and 
> restart the JVM if it fails. This is up to the operator: if it is not done 
> automatically, it has to be done manually, with the operator restarting 
> the ZK server JVM.
> The inherent behavior of ZK with respect to failures, i.e. that it 
> automatically recovers as long as quorum is maintained, fits into this nicely.
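
The "wrapper script" option described above could look roughly like the following. This is only a minimal sketch, not what the attached patch documents: the function name `supervise` and its parameters `max_restarts` and `backoff_seconds` are made up for illustration, and it assumes the operator supplies a command that runs the ZK server JVM in the foreground, so that the wrapper is the direct parent of the process it watches.

```python
import subprocess
import sys
import time


def supervise(cmd, max_restarts=None, backoff_seconds=1.0, sleep=time.sleep):
    """Run cmd in the foreground and start it again whenever it exits.

    cmd is an argument list for the server process (supplied by the operator).
    max_restarts=None means supervise forever; otherwise stop after that many
    restarts and return the list of exit codes seen.
    """
    exit_codes = []
    while True:
        child = subprocess.Popen(cmd)
        code = child.wait()  # blocks until the child exits or is killed
        exit_codes.append(code)
        print(f"pid {child.pid} exited with code {code}", file=sys.stderr)
        if max_restarts is not None and len(exit_codes) > max_restarts:
            return exit_codes
        sleep(backoff_seconds)  # brief pause to avoid a tight crash loop


# Hypothetical usage: supervise(["java", "-cp", "<classpath>", "<main class>", "<config>"])
# where the placeholders are whatever the operator normally uses to launch the server.
```

A dedicated supervisor such as daemontools covers the same ground (plus logging and clean shutdown handling), so a script like this only makes sense where such a tool is not available.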

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
