On 06/25/2010 02:47 PM, Alexis Midon wrote:
1. Session events i.e. Type-None events are sent to all outstanding
watch handlers. So if you do get(path, watcherX), both the default
listener and watcherX will receive the session events.


That's true. This enables the watcher to handle the case (for example) when the client has become disconnected from the cluster. Per-operation watchers were specifically added to support the "zk library" case, where more than a single consumer would be using the client connection. This makes it a lot easier to add libraries that depend on zk.

  2. Watchers are one-time triggers, however session events do NOT
remove a watcher.
  In other words, if we're listening for NodeCreated event and a
disconnection occurs, we will eventually be notified of a Disconnected,
then a SyncConnected and finally a NodeCreated without having to set any
new watcher.

Correct.
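
Points 1 and 2 can be illustrated with a toy in-memory model. This is a sketch of the semantics described in this thread, not the ZooKeeper client itself: WatchModel, Event, watch, sessionEvent, and pathEvent are hypothetical names, and the delivery order shown is an artifact of the model (the real client gives no ordering guarantee).

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.function.Consumer;

/** Toy model of client-side watch dispatch; NOT the real ZooKeeper client. */
final class WatchModel {
    /** Hypothetical stand-in for a delivered event; path is null for session events. */
    static final class Event {
        final String kind;  // e.g. "Disconnected", "SyncConnected", "NodeCreated"
        final String path;  // null for session (type None) events
        Event(String kind, String path) { this.kind = kind; this.path = path; }
    }

    private final Consumer<Event> defaultWatcher;
    private final Map<String, List<Consumer<Event>>> pathWatchers = new HashMap<>();

    WatchModel(Consumer<Event> defaultWatcher) { this.defaultWatcher = defaultWatcher; }

    /** Registers a one-time watcher on a path (models a successful call that sets a watch). */
    void watch(String path, Consumer<Event> watcher) {
        pathWatchers.computeIfAbsent(path, p -> new ArrayList<>()).add(watcher);
    }

    /** Session event: goes to the default watcher and every outstanding watcher; none is removed. */
    void sessionEvent(String state) {
        Set<Consumer<Event>> targets = new LinkedHashSet<>();
        targets.add(defaultWatcher);
        for (List<Consumer<Event>> ws : pathWatchers.values()) {
            targets.addAll(ws);
        }
        Event e = new Event(state, null);
        for (Consumer<Event> w : targets) {
            w.accept(e);
        }
    }

    /** Path event: goes only to watchers on that path, which are then removed (one-time trigger). */
    void pathEvent(String type, String path) {
        List<Consumer<Event>> ws = pathWatchers.remove(path);
        if (ws == null) {
            return;  // nothing registered, or the watch already fired
        }
        Event e = new Event(type, path);
        for (Consumer<Event> w : ws) {
            w.accept(e);
        }
    }
}
```

In this model, a watcher registered on "/foo" sees Disconnected, then SyncConnected, then a single NodeCreated. A second NodeCreated on the same path is not delivered unless the watch is set again.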

  3. If the invocation of a (synchronous or asynchronous) method fails,
the watcher is not set. For instance if getChildren("/foo", mywatcher)
fails because the client is disconnected, mywatcher won't be notified of
future events.

Correct, a watch is only valid if the operation was successful.
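
Because a failed call leaves no watch behind, the caller has to issue the call again to get one. A minimal sketch of such a retry wrapper follows. WatchedCall and retry are hypothetical names, the wrapper uses only the Java standard library, and it retries immediately without backoff. Real code would probably catch only connection-loss failures (for example KeeperException.ConnectionLossException) and wait for the client to reconnect between attempts.

```java
import java.util.concurrent.Callable;

/** Hypothetical helper: repeat a watch-setting call until it succeeds. */
final class WatchedCall {
    /**
     * Invokes op until it returns normally or maxAttempts is used up.
     * Assumption taken from the thread: a failed invocation registers no
     * watch, so repeating the call cannot leave a duplicate watch behind.
     */
    static <T> T retry(Callable<T> op, int maxAttempts) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();  // success: the watch passed inside op is now set
            } catch (Exception e) {
                last = e;          // failure: no watch was set, try again
            }
        }
        throw new IllegalStateException("gave up after " + maxAttempts + " attempts", last);
    }
}
```

A caller might wrap the call from point 3 as `WatchedCall.retry(() -> zk.getChildren("/foo", mywatcher), 5)`, where `zk` is assumed to be an existing ZooKeeper handle.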


I apologize in advance if I'm stating the obvious but the differences
between "path" events and "session" events were not clear to me.


No, this is great. Feel free to enter a JIRA if this is not clear enough.

<http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperProgrammers.html#ch_zkWatches>

Alexis


This (3.1.1) is a pretty old version of the docs. I'd suggest that you look at the most recent version before entering JIRAs:

http://hadoop.apache.org/zookeeper/docs/current/zookeeperProgrammers.html#ch_zkWatches

Regards,

Patrick

On Fri, Jun 25, 2010 at 12:36 PM, Patrick Hunt <ph...@apache.org
<mailto:ph...@apache.org>> wrote:



    On 06/12/2010 10:07 PM, Alexis Midon wrote:

        I implemented queues and locks on top of ZooKeeper, and I'm
        pretty happy so
        far. Thanks for the nice work. Tests look good. So good that we
        can focus on
        exception/error handling and I got a couple of questions.

        #1. Regarding the use of the default watcher. A ZooKeeper
        instance has a
        default watcher, most operations can also specify a watcher.
        When both are
        set, does the operation watcher override the default watcher?


    If you use get(path, bool), the default watcher is notified;
    if you use get(path, watcherX), only "watcherX" is notified.


          or will both watchers be invoked? if so in which order? Does
        each watcher
        receive all the types of event?


    No, the two watchers are not both invoked.


        I had a look at the code, and my understanding is that the
        default watcher
        will always receive the type-NONE events, even if an "operation"
        watcher is
        set. No guarantee on the order of invocation though. Could you
        confirm
        and/or complete please?


    The watcher gets both state change notifications and watch events.
    You can register multiple watchers for the same path (including the
    default); there is no guarantee on ordering at all.


        #2 After a connection loss, the client will eventually reconnect
        to the ZK
        cluster so I guess I can keep using the same client instance.
        But are there


    right


        cases where it is necessary to re-instantiate a ZooKeeper
        client? As a first
        recovery strategy, is it OK to always recreate a client so
        that any
        ephemeral nodes previously owned disappear?


    If the session has expired, that is the case where you need to
    recreate the session object (or if you explicitly close it).

    Yes, this is a fine strategy if your application domain "fits". If
    you have a very expensive "recovery" or "bootstrap" process then
    recreating the session on every disconnect would be a bad idea.
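
The trade-off above can be written down as a small decision function. This is only a sketch: SessionRecovery, State, Action, and the cheapToRebuild flag are hypothetical names, and State is a stand-in for the session states mentioned in this thread rather than the client's own enum.

```java
/** Hypothetical decision table for reacting to session events. */
final class SessionRecovery {
    /** Stand-in for the session states named in the thread. */
    enum State { DISCONNECTED, SYNC_CONNECTED, EXPIRED }

    enum Action { WAIT_FOR_RECONNECT, RESUME, RECREATE_CLIENT }

    /**
     * cheapToRebuild is an application-level assumption: true when recreating
     * the session (and redoing any bootstrap work) costs little.
     */
    static Action onSessionEvent(State state, boolean cheapToRebuild) {
        switch (state) {
            case EXPIRED:
                // An expired session cannot be revived; a new client is required.
                return Action.RECREATE_CLIENT;
            case DISCONNECTED:
                // Recreating drops owned ephemeral nodes; only worth it if recovery is cheap.
                return cheapToRebuild ? Action.RECREATE_CLIENT : Action.WAIT_FOR_RECONNECT;
            default:
                return Action.RESUME;
        }
    }
}
```

An application with an expensive bootstrap would pass `false` and keep the same client instance across a disconnect.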


        The case I struggle with is the following:
        Let's say I've acquired a lock (i.e. an ephemeral locknode is
        created).
        Some application logic failed due to a connection loss. At this
        stage I'd
        like to give up/roll back. Here I would typically throw an
        exception, the
        lock being released in a finally. But I can't release the lock
        since the
        connection is down. Later the client eventually reconnects, the
        session
        didn't expire so the locknode still exists. Now no one else can
        acquire this
        lock until my session expires.


    Yes, you are reading the situation correctly. In this case you
    either have to take the easy route - close the session and create a
    new one (again, if your app domain supports this) or your client
    needs to check if the lock is still being held (it's still the
    owner) when it's eventually reconnected. You can verify this for an
    ephemeral node by looking at the "ephemeralOwner" field of the Stat
    object. If this matches your session id then you are the owner and
    still hold the lock. This is a bit tricky to get right though, so in
    some cases clients just close the session and recreate.
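
The ownership check described above might look like the following sketch. LockOwnership, Status, and check are hypothetical names. The inputs are plain values so the logic is self-contained. In a real client they would presumably come from Stat.getEphemeralOwner() on the Stat returned by exists() on the lock node (null when the node is gone) and from ZooKeeper.getSessionId().

```java
/** Hypothetical post-reconnect check: does this session still hold the lock node? */
final class LockOwnership {
    enum Status { STILL_HELD, NODE_GONE, HELD_BY_OTHER_SESSION }

    /**
     * @param ephemeralOwner owner session id of the lock node, or null if the
     *                       node no longer exists (assumed to be looked up
     *                       after the client has reconnected)
     * @param sessionId      this client's current session id
     */
    static Status check(Long ephemeralOwner, long sessionId) {
        if (ephemeralOwner == null) {
            return Status.NODE_GONE;  // node was deleted, e.g. after session expiry
        }
        return ephemeralOwner.longValue() == sessionId
                ? Status.STILL_HELD
                : Status.HELD_BY_OTHER_SESSION;
    }
}
```

On STILL_HELD the client can delete the lock node to release the lock. The other two outcomes mean the lock is no longer this client's to release.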



        #3. could you describe the recommended actions for each
        exception code?


    this is highly dependent on your application requirements. See above
    for my general guidance. Feel free to ask more questions.

    Regards,

    Patrick

