One potential problem with solution 2 is that a naive implementation may cause what we call a "herd effect": once there is a new message, ZooKeeper generates a large number of notifications, and all the notified clients then issue a request to read the message. Depending on the requirements of your application, one way around it is to randomize accesses by having clients read new messages "in batches". One possibility is the following: a client sets a watch x messages ahead; when that watch fires, x new messages are available, so the client reads all of them and sets its next watch another x messages ahead. To keep the clients from all waking up on the same message, they have to start at different points of the message sequence.
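To make the effect of staggering concrete, here is a minimal sketch in plain Python that only counts notifications and does not talk to ZooKeeper. The function name `wakeups` and its parameters are made up for illustration, and it assumes each client re-arms its watch exactly `batch` messages after the previous one.

```python
from collections import Counter


def wakeups(offsets, batch, total_messages):
    """Count how many clients are notified when each message arrives.

    offsets[i] is the sequence number of client i's first watch (a
    hypothetical way to stagger clients); after each notification the
    client sets its next watch `batch` messages ahead.
    """
    counts = Counter()
    for first in offsets:
        for seq in range(first, total_messages + 1, batch):
            counts[seq] += 1
    return counts


# Four clients, batch size 4, twelve messages in total.
staggered = wakeups(offsets=[1, 2, 3, 4], batch=4, total_messages=12)
aligned = wakeups(offsets=[4, 4, 4, 4], batch=4, total_messages=12)

print("max clients woken per message, staggered:", max(staggered.values()))
print("max clients woken per message, aligned:", max(aligned.values()))
```

With staggered starting points each new message wakes a single client; when all clients start at the same point, every fourth message wakes all of them at once.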

If I understand it correctly, you propose two mechanisms:

1- Have a single znode, and modify the data of that znode;
2- Have a znode, say "/broadcast", and have clients create a new child znode under "/broadcast" for every new message they want to broadcast.

In case 1), if you are proposing to overwrite the content of the znode, then you would first need to make sure that all receivers have already received the previous message. This doesn't seem like a good solution to me, because a client that wants to broadcast a message would have to wait until all the others flag that they have received the previous message. Appending new messages to the current content of the znode doesn't seem like a great solution either, because clients would have to make sure that they are overwriting the correct version, and retry whenever another client has written in the meantime.
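The "correct version" concern can be sketched as a read, conditional write, retry loop. The code below is only an in-memory stand-in, not the ZooKeeper client API: `FakeZnode`, `BadVersion`, and `append_message` are hypothetical names that mimic a versioned znode whose write is rejected when the expected version is stale.

```python
class BadVersion(Exception):
    """Raised when a write names a version that is no longer current."""


class FakeZnode:
    """In-memory stand-in for a single versioned znode (illustration only)."""

    def __init__(self):
        self.data = b""
        self.version = 0

    def get(self):
        return self.data, self.version

    def set(self, data, expected_version):
        # Reject the write if someone else updated the node in between.
        if expected_version != self.version:
            raise BadVersion(
                f"expected {expected_version}, current is {self.version}"
            )
        self.data = data
        self.version += 1


def append_message(node, message, max_retries=5):
    """Append one message; re-read and retry if another writer got in first."""
    for _ in range(max_retries):
        data, version = node.get()
        try:
            node.set(data + message + b"\n", version)
            return True
        except BadVersion:
            continue  # lost the race: read the new content and try again
    return False
```

Every broadcaster goes through this loop on the same znode, so under load writers contend with each other and the node's content keeps growing, which is the drawback described above.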

Solution 2) sounds better to me because it is wait-free: no client willing to broadcast a message contends with other clients. Now, as you say, you have to use the sequence flag to make sure that all messages are delivered in the same order (if you want total order). You may still want to have a mechanism for clients to flag which messages they have already delivered, so that you can remove messages that everyone has already delivered. If your application generates a large number of messages and you don't remove them, you might end up with an unnecessarily large number of znodes. Finally, I'm not sure why you want to use ephemeral znodes for messages. I think that ephemeral znodes may cause you trouble in this case. Suppose that a client broadcasts a message by creating an ephemeral znode and crashes before all other clients deliver the message. The znode is removed when the client's session ends, so clients that have not read it yet never see the message, and you may end up with two clients delivering different sequences of messages.
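As a rough sketch of the ordering and cleanup logic, assuming message znodes are created with the sequence flag (ZooKeeper appends a 10-digit, zero-padded counter to the name). The "msg-" prefix, the helper names, and the `delivered_upto` bookkeeping are assumptions for illustration; how clients actually record what they have delivered is left open.

```python
def sequence_number(child_name):
    """Extract the counter that the sequence flag appends to the name."""
    return int(child_name[-10:])


def delivery_order(children):
    """Return child names in the order every client should deliver them."""
    return sorted(children, key=sequence_number)


def removable(children, delivered_upto):
    """Messages that every client has flagged as delivered.

    delivered_upto maps a client id to the highest sequence number that
    client has delivered (hypothetical bookkeeping, e.g. one znode per
    client). Nothing is removable while no client has reported.
    """
    if not delivered_upto:
        return []
    low_water_mark = min(delivered_upto.values())
    return [
        name
        for name in delivery_order(children)
        if sequence_number(name) <= low_water_mark
    ]


# getChildren gives no ordering guarantee, so sort before delivering.
children = ["msg-0000000002", "msg-0000000000", "msg-0000000001"]
print(delivery_order(children))
print(removable(children, {"client-a": 1, "client-b": 0}))
```

Only messages at or below the slowest client's mark are safe to delete, so a client that stops reporting holds back cleanup for everyone.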


1. Group Messaging:
Is it possible to do group messaging with ZooKeeper 3.0.1?
If not, are there any plans to add group messaging (say with causal ordering
or, even better, total ordering) to future releases?
What would be the best approach if I need to do group messaging using ZooKeeper right now? Should I use a separate group messaging or messaging
broker service?

You could just have a node named '/service/foo' and then have all members
in your group listen to this node. When you need to push a message you can
write to that node.

Or you could use a directory with ephemeral sequence files, once per

