On Thu, Feb 4, 2010 at 2:20 PM, Yonik Seeley <yo...@lucidimagination.com> wrote:
> There's no way to "hand over" responsibility for an ephemeral znode, right?
> We have solr nodes create ephemeral znodes (name based on host and port).
> The ephemeral znode takes some time to remove of course, so what
> happens is that if I bounce a solr server (containing a zk client) the
> ephemeral node will still exist when the server comes back up.
This problem comes up with any system that has hysteresis and needs a single
point of control.
> What's the best way to handle this situation? Delete and re-create?
> Watch it and re-create when it does disappear?
I think you need to handle the problem of multiple search nodes coming up on
the same machine, possibly because the old one may have hung up.
So... I would recommend
a) if the ephemeral still exists, wait for a few more seconds to see if it
goes away
b) if it goes away, create a new one and continue as normal
c) if it doesn't go away, take additional action to determine if the old
service is still running (i.e. panic and run in circles).
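
As a rough sketch of steps a) through c), something like the following might
work. It is written against a hypothetical client object that only needs an
exists(path) method and a create(path, value, ephemeral=True) method; those
names, the grace period, the poll interval, and the FakeZk stand-in are all
assumptions for illustration, not anything Solr or ZooKeeper ships. A real
client would also have to cope with another process creating the node between
the exists check and the create.

```python
import time


class NodeStillPresent(Exception):
    """Raised when the old ephemeral znode outlives the grace period."""


def claim_ephemeral(zk, path, data=b"", grace_seconds=10.0, poll_seconds=0.5,
                    sleep=time.sleep, clock=time.monotonic):
    """Wait for a stale ephemeral node to vanish, then create a fresh one.

    `zk` is any object with exists(path) and create(path, value, ephemeral=...)
    (a hypothetical interface for this sketch).
    """
    deadline = clock() + grace_seconds
    # (a) the old node may just be waiting for its session to time out
    while zk.exists(path):
        if clock() >= deadline:
            # (c) still there: the caller has to find out whether the old
            # service is really alive before doing anything else
            raise NodeStillPresent(path)
        sleep(poll_seconds)
    # (b) the old node is gone, so register this instance
    zk.create(path, data, ephemeral=True)
    return path


class FakeZk:
    """In-memory stand-in so the sketch runs without a ZooKeeper server."""

    def __init__(self, stale=None, polls_until_expiry=0):
        self.nodes = set()
        self._stale = {}  # path -> exists() calls left before it expires
        if stale is not None:
            self.nodes.add(stale)
            self._stale[stale] = polls_until_expiry

    def exists(self, path):
        if path in self._stale:
            if self._stale[path] <= 0:
                # simulate the old session finally timing out
                del self._stale[path]
                self.nodes.discard(path)
            else:
                self._stale[path] -= 1
        return path in self.nodes

    def create(self, path, value=b"", ephemeral=False):
        if path in self.nodes:
            raise RuntimeError("node already exists: " + path)
        self.nodes.add(path)


if __name__ == "__main__":
    # "/live_nodes/host:8983" is a made-up path for the demo
    zk = FakeZk(stale="/live_nodes/host:8983", polls_until_expiry=3)
    claim_ephemeral(zk, "/live_nodes/host:8983", sleep=lambda s: None)
```

Raising in case c) rather than deleting the old node keeps the decision about
a possibly hung-but-alive instance out of the registration code.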