> Could you provide some insight into why you need this? Just so we have addl
> background, I'm interested to know the use case.
Sure, we're building a clustered application that will use ZooKeeper
as part of it. We need to manage ZK ourselves. The cluster running the
app & ZK may change over time (nodes added or removed) and we need to
keep ZK itself in sync with any changes. They won't be common, but we
can't shut the app down to make the changes; it needs to be
> Are you expecting all of the servers to change each time, or just
> incremental changes (add/remove a single server, vs say move the entire
> cluster from 3 hosts a/b/c to x/y/z)
I'd expect a small number of changes at any time - a few nodes being
added, a few nodes being removed. Most of the nodes will stay the
> Any chance you could use DNS for this? ie change the mapping for the
> hostname from a -> x ip? Since the server a will go down anyway, this would
> cause the client to reconnect to b/c (eventually when dns ttl expires the
> client would also potentially connect to x).
Well, there are a lot of issues with DNS (including security & caching)
so I'd prefer to avoid it. Also, the real issue is that the number of
servers is changing, not just their IPs.
Although we probably wouldn't use it, I do think it would be nice to
support a single hostname for the ZK cluster with one A record for
each member, and have the ZK client resolve that name afresh
each time it connects.
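
To make the single-hostname idea concrete, here is a minimal sketch of
what "resolve on every connect" could look like. It uses only the JDK's
InetAddress.getAllByName; the class name EnsembleResolver and its
resolve method are hypothetical and not part of the ZK client. Note
that the JVM caches lookups itself (the networkaddress.cache.ttl
security property), so the cache concern above still applies.

```java
import java.net.InetAddress;
import java.net.InetSocketAddress;
import java.net.UnknownHostException;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical helper, not part of the ZooKeeper client.
class EnsembleResolver {
    // Look up every address record for the name and pair each with the port.
    // Meant to be called again on each (re)connect, so that members added to
    // or removed from the name are picked up.
    public static List<InetSocketAddress> resolve(String host, int port)
            throws UnknownHostException {
        List<InetSocketAddress> servers = new ArrayList<>();
        for (InetAddress addr : InetAddress.getAllByName(host)) {
            servers.add(new InetSocketAddress(addr, port));
        }
        // Shuffle so that clients do not all try the same member first.
        Collections.shuffle(servers);
        return servers;
    }
}
```

A caller would invoke EnsembleResolver.resolve(name, port) from its
reconnect path instead of parsing a fixed host list once at startup.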
> You might also look at this patch, we never committed it but it might be
> interesting to you:
> The benefit is that you'd only have one place to make the change, esp given
> that clients might be down/unreachable when this change occurs. Clients
> would have to poll this service whenever they get disconnected from the
> ensemble. One drawback of this approach is that the HTTP now becomes a
> potential SPOF. (although I guess you could always fall back to something,
> or potentially have a list of HTTP hosts to do the lookup, etc...).
Well, that just handles distribution of the list (which isn't really
our problem). It doesn't help with restarting the ZK client when the
list changes: it only pulls the list once, so you still have to
completely shut down and restart the ZK client.
> It does sound interesting, however once we add something like this it's hard
> to change given that we try very hard to maintain b/w compatibility. If you
> did the testing and were able to verify I don't see why we couldn't add it -
> as it's "optional" in the sense that it would only be called in the use case
> you describe. I would feel more confident if we had more concrete detail on
> how we intend to do 107 (a basic functional/design doc that at least reviews
> all the issues), and how this would fit in. But I don't see that should
> necessarily be a blocker (although others might feel differently).
Have you ever considered adding features like this via a protected
interface (i.e. they are useful but aren't fully standardized, so if a
client wants to use them they can subclass ZK and make them public)?
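
As a rough illustration of what I mean by a protected interface, here
is a sketch. BaseClient, DynamicClient, updateServerList and
currentServers are all made-up names standing in for the real client
class; none of this is the actual ZK API.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the library's client class.
class BaseClient {
    private volatile List<String> servers;

    BaseClient(List<String> servers) {
        this.servers = new ArrayList<>(servers);
    }

    // Experimental hook: reachable only from subclasses (and same-package
    // code), so it is not part of the supported public API.
    protected void updateServerList(List<String> newServers) {
        this.servers = new ArrayList<>(newServers);
    }

    // Returns a copy so callers cannot mutate the client's list.
    public List<String> currentServers() {
        return new ArrayList<>(servers);
    }
}

// An application that accepts the risk opts in by widening the visibility.
class DynamicClient extends BaseClient {
    DynamicClient(List<String> servers) {
        super(servers);
    }

    @Override
    public void updateServerList(List<String> newServers) {
        super.updateServerList(newServers);
    }
}
```

The library keeps the freedom to change the hook later, and only
applications that explicitly subclass are exposed to that change.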
The ability to dynamically modify the server list on the client side
seems like it would be required no matter what approach were taken to