Dominic Williams wrote:
1/ If a node crashes or something else goes wrong, you leave behind
persistent nodes. Over time these will grow and grow, rather like the old
tmp folders used to fill with files under Windows.

That's true. You either use ephemeral nodes, or you use persistent nodes and run a "garbage collector" (implicit or explicit) to clean up after them. In most cases ephemerals are preferable.
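To make the "explicit gc" option concrete, here is a minimal sketch in plain Python. It assumes each persistent node records an owner id in its data and that you have some way to know which owners are still live; both are assumptions for illustration, and stale_nodes is a hypothetical helper, not part of the ZK API.

```python
def stale_nodes(nodes, live_owners):
    """Return the paths whose recorded owner is no longer live.

    nodes: dict mapping path -> owner id stored in the node's data (assumed layout).
    live_owners: set of owner ids known to be alive (how you learn this is up to you).
    """
    return sorted(path for path, owner in nodes.items() if owner not in live_owners)


nodes = {
    "/locks/lock-0001": "worker-a",   # worker-a crashed and left this behind
    "/locks/lock-0002": "worker-b",
}
stale = stale_nodes(nodes, live_owners={"worker-b"})
```

A collector would then delete the returned paths. With ephemerals none of this bookkeeping is needed, which is why they are usually the better choice.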

2/ Persistent nodes = nasty scalability *bottleneck* because you're actually
having to write to disk somewhere.

This is not actually how ZK works. All znodes, whether persistent or ephemeral, are written to disk. An ephemeral node is tied to the session that created it: as long as the session is alive, the ephemeral node is alive. Sessions themselves are stored persistently and reliably by the ZK cluster, so you can shut down the entire cluster and restart it, and all sessions and ephemerals will be maintained. Sessions can also move from server to server (if, say, network connectivity to server A fails, or server A itself fails, the client will move to server B). The session and all of its ephemerals are maintained, as long as the client moves within the session expiration timeout.
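A toy model may help here. The sketch below is plain in-memory Python, not ZooKeeper code; ToyEnsemble and its methods are hypothetical names made up for illustration. It only shows the ownership rule described above: an ephemeral survives a move to another server, and goes away when its session expires.

```python
class ToyEnsemble:
    """Toy model of session-owned ephemeral znodes (not the real implementation)."""

    def __init__(self):
        self.znodes = {}     # path -> owning session id, or None for persistent
        self.sessions = {}   # session id -> server the client is attached to

    def open_session(self, session_id, server):
        self.sessions[session_id] = server

    def create(self, path, session_id=None):
        # session_id=None models a persistent znode; otherwise an ephemeral one.
        if session_id is not None and session_id not in self.sessions:
            raise KeyError("unknown session: %r" % session_id)
        self.znodes[path] = session_id

    def move_session(self, session_id, new_server):
        # Reconnecting to another server within the timeout keeps the session,
        # so nothing it owns is touched.
        self.sessions[session_id] = new_server

    def expire_session(self, session_id):
        # Expiry ends the session and removes every ephemeral it owns.
        del self.sessions[session_id]
        self.znodes = {p: o for p, o in self.znodes.items() if o != session_id}


ens = ToyEnsemble()
ens.open_session("s1", "serverA")
ens.create("/locks")                   # persistent parent
ens.create("/locks/lock-0001", "s1")   # ephemeral, owned by session s1

ens.move_session("s1", "serverB")      # failover: the ephemeral survives
after_move = sorted(ens.znodes)

ens.expire_session("s1")               # expiry: the ephemeral is removed
after_expiry = sorted(ens.znodes)
```

After the move both znodes are still present; after expiry only the persistent /locks remains.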

To avoid this I'm actually thinking of writing a locking system where you work
out the existing chain not by enumerating sequential children, but by
looking at the contents of each temporary lock node to see what it is
waiting on. But... that's quite horrible. Was wondering whether there is
some technical reason why ephemeral nodes can't have children??

There are a few cases to think about.

1) Obviously ephemeral nodes can't have persistent children; this just doesn't make sense, since the child would outlive its parent.

2) Ephemeral nodes have an owner: the session that created them. So it would also not make sense (in my mind at least) to have an ephemeral /foo with another ephemeral /foo/bar that has a different owner.

3) So you are left with "an ephemeral can be a child of an ephemeral with the same owner".

4) There are also issues of ordering. In particular, what is the "deletion order" when the session goes away: depth first, breadth first, etc.?
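To make cases 1-4 concrete, here is a small sketch in plain Python. ZK does not support this today, so these are hypothetical rules rather than anything in the ZK code base; child_allowed and deletion_order are made-up names. The ordering shown is just one possible answer to case 4: deepest nodes first, so every child is removed before its parent.

```python
def child_allowed(parent_owner, child_owner):
    """Owners are session ids; None marks a persistent node."""
    if parent_owner is None:
        return True                        # persistent parent: any child is fine
    # Ephemeral parent: rejects persistent children (case 1) and children
    # with a different owner (case 2), leaving only the same owner (case 3).
    return child_owner == parent_owner


def deletion_order(paths):
    """One possible order for case 4: deepest paths first, ties broken by name."""
    return sorted(paths, key=lambda p: (-p.count("/"), p))


owned = ["/foo", "/foo/bar", "/foo/bar/baz", "/foo/qux"]
order = deletion_order(owned)
```

Here order comes out as /foo/bar/baz, /foo/bar, /foo/qux, /foo. A breadth-first choice would instead need some way to delete a node that still has children, which is part of why this gets complicated.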

I believe the answer so far has been "we don't do this because it's fairly complicated and we haven't seen any use cases that require it." In the cases I've seen so far there was either a misunderstanding of how ZK works, or a simpler way available.

Does that make sense? Thoughts?

