Vinod Johnson wrote:


I guess then I don't follow the leader election recipe. Is the following scenario possible in the leader election recipe:
1) Leader L is partitioned from the ensemble.
2) ZK servers expire its session.
3) Some other follower F now becomes a leader.
4) L and F form a split brain?

I had wrongly assumed that the session was like a lease in that it allowed the client and server to independently know that the session had expired by the use of the global clock. Wouldn't it make sense for the client lib to expire its local session handle and never reuse it?

Here's a good reason for each client to know its session status (connected/disconnected/expired): depending on the application, if L does not have a connected session to the ensemble, it may need to be careful about how it acts.
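To make the three session states concrete, here is a minimal sketch of a leader that only acts while its session is connected. It is a plain-Python model, not the ZooKeeper client API: `SessionState`, `LeaderGate` and all method names are hypothetical, and it assumes leadership is held through an ephemeral node that disappears when the session expires, as in the election recipe.

```python
from enum import Enum


class SessionState(Enum):
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"
    EXPIRED = "expired"


class LeaderGate:
    """Hypothetical helper: decides whether L may act as leader right now."""

    def __init__(self):
        self.state = SessionState.DISCONNECTED
        self.holds_leader_token = False

    def on_session_event(self, new_state):
        # In a real client this would be driven by session event callbacks.
        self.state = new_state
        if new_state is SessionState.EXPIRED:
            # Assumption: the leader token is an ephemeral node, so it is
            # gone once the session has expired.
            self.holds_leader_token = False

    def on_elected(self):
        self.holds_leader_token = True

    def may_send_commands(self):
        # Play it safe: act only with the token AND a connected session.
        return self.holds_leader_token and self.state is SessionState.CONNECTED
```

With this shape, a disconnect pauses the leader and a timely reconnect lets it resume, while an expiry forces it back through election.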

I'm trying to think through some cases...

In the case of a passive leader, the followers will look at zk and only send requests to the leader, so this seems fine (L no longer gets requests; it syncs to the ensemble at some point, finds its session expired, and recovers as appropriate).

But depending on timing, couldn't the old leader still get a request from some follower who is lagging in terms of event receipt (or is disconnected - which brings up the question of dealing with disconnection at the follower)? Not sure how likely this is in practice ... but I can't say I'm comfortable with all the theoretical possibilities at this point. In this case, a disconnected leader could play it safe and not accept new requests.

Yes, this is definitely a possibility. It takes time for a session to expire. If your leader dies, the followers will continue to send requests until the session expires and they are notified. If a follower is connected to a server that's lagging, the notification may be delayed, and so on. However, these types of cases probably have to be handled anyway; for example, the follower may be able to talk to the zk ensemble but not to the leader because of some network issue.

In the case of an active leader, L continues to send commands (whatever) to the followers. However, a new leader L' has since been elected and is also sending commands to the followers. In this case it seems like either a) L should not send commands unless it is sync'd to the ensemble (and holds the leader token), or b) followers should not accept commands from a non-leader (only accept from the current leader). a) seems the right way to go; if L is disconnected it should stop sending commands to the followers. If it's resync'd in time it can start sending commands again; otherwise its session will expire, a new leader L' will be elected and will start sending commands to the followers, and eventually L will resync and notice that it is no longer the leader (and do whatever it takes to recover).

Seems to make sense in this particular case (I had some other cases in mind that I'm not so sure about, though).

Feel free to discuss...
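For comparison, option b) above could look roughly like the following on the follower side. This is only a sketch in plain Python; the `Follower` class, its methods and the string leader ids are all hypothetical, and in a real system `on_leader_changed` would be driven by a watch on the election node rather than called by hand.

```python
class Follower:
    """Hypothetical follower that only obeys the leader it last saw in zk."""

    def __init__(self):
        self.known_leader = None
        self.applied = []

    def on_leader_changed(self, leader_id):
        # Assumption: invoked when the follower is notified of a new leader.
        self.known_leader = leader_id

    def handle_command(self, sender_id, command):
        # Reject anything that does not come from the leader we know about.
        if sender_id != self.known_leader:
            return False
        self.applied.append(command)
        return True
```

Note that a follower whose notification is delayed will still accept commands from the old leader until `on_leader_changed` fires, which is the lagging-follower case discussed earlier.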

> Wouldn't it make sense for the
> client lib to expire its local session handle and never reuse it?

I would think that depends on how expensive it is to change leaders. It would be trivial for the client to close its session and start a new one each time it's notified of a disconnect from the ensemble.

Perhaps that's good enough. An alternative would be to wait for the timeout period.
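A rough sketch of the "wait for the timeout period" alternative: on disconnect the client starts a local timer and only gives up leadership once the session timeout has elapsed without a reconnect. `DisconnectTimer` and its methods are hypothetical names, not part of any client library, and the local timer is only an approximation since the ensemble measures the timeout from its own point of view; a real implementation would probably want a safety margin.

```python
import time


class DisconnectTimer:
    """Hypothetical guard: step down once disconnected for the full timeout."""

    def __init__(self, session_timeout_s, clock=time.monotonic):
        self.session_timeout_s = session_timeout_s
        self.clock = clock  # injectable so the logic can be tested
        self.disconnected_at = None

    def on_disconnected(self):
        # Remember only the first disconnect of an outage.
        if self.disconnected_at is None:
            self.disconnected_at = self.clock()

    def on_reconnected(self):
        # Reconnected with the session still valid: clear the timer.
        self.disconnected_at = None

    def must_step_down(self):
        if self.disconnected_at is None:
            return False
        elapsed = self.clock() - self.disconnected_at
        return elapsed >= self.session_timeout_s
```

Compared with closing the session on every disconnect, this avoids a leader change for short network blips, at the cost of relying on the client's own clock.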

Patrick
