We are having a similar problem. At the moment we wrap ZooKeeper in a class which retries requests on KeeperException.ConnectionLossException, so that we never end up with no watcher registered. Our worry is that this may result in multiple watchers being added if the watcher was successfully added on the server but the client still saw a connection loss (for example, because the response never arrived).
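For illustration, a bounded retry wrapper of the kind described above might look roughly like the sketch below. The names `RetryOnConnectionLoss` and `ConnectionLossStandIn` are made up for this example, and the stand-in exception replaces `KeeperException.ConnectionLossException` so that the code compiles without the ZooKeeper jar. Only connection loss is retried; anything else (session expiry, for instance) propagates to the caller.

```java
import java.util.concurrent.Callable;

/** Stand-in for KeeperException.ConnectionLossException (assumption: keeps the sketch dependency-free). */
class ConnectionLossStandIn extends Exception {
    ConnectionLossStandIn(String message) {
        super(message);
    }
}

/** Hypothetical helper: retries an operation a bounded number of times on connection loss. */
final class RetryOnConnectionLoss {
    private final int maxAttempts;
    private final long backoffMillis;

    RetryOnConnectionLoss(int maxAttempts, long backoffMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        this.maxAttempts = maxAttempts;
        this.backoffMillis = backoffMillis;
    }

    /** Runs op; retries only on connection loss, rethrows every other exception immediately. */
    <T> T call(Callable<T> op) throws Exception {
        ConnectionLossStandIn last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return op.call();
            } catch (ConnectionLossStandIn e) {
                last = e;
                if (attempt < maxAttempts) {
                    // Linear backoff between attempts; no sleep after the final failure.
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        throw last; // every attempt ended in connection loss
    }
}
```

With the real client the call would be something like `retry.call(() -> zk.getChildren(zPath, watcher))`, catching `KeeperException.ConnectionLossException` instead of the stand-in. On the duplicate-watcher worry: the ZooKeeper programmer's guide states that a given watch object is triggered only once for a given notification, so retrying with the same `Watcher` instance should not, as far as I can tell, produce duplicate callbacks. That is worth verifying against the client version in use.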

Colin

-----Original Message-----
From: Eric Bowman [mailto:ebow...@boboco.ie]
Sent: 01 February 2010 10:22
To: zookeeper-user@hadoop.apache.org
Subject: Re: how to handle re-add watch fails

I was surprised to not get a response to this ... is this a no-brainer?
Too hard to solve? Did I not express it clearly? Am I doing something
dumb? :)

Thanks,
Eric

On 01/25/2010 01:05 PM, Eric Bowman wrote:
> I'm curious, what is the "best practice" for how to handle the case
> where re-adding a watch inside a Watcher.process callback fails?
>
> I keep stumbling upon the same kind of thing, and the possibility of
> race conditions or undefined behavior keeps troubling me. Maybe I'm
> missing something.
>
> Suppose I have a callback like:
>
>     public void process( WatchedEvent watchedEvent )
>     {
>         if ( watchedEvent.getType() ==
>                 Event.EventType.NodeChildrenChanged ) {
>             try {
>                 ... do stuff ...
>             }
>             catch ( Throwable e ) {
>                 log.error( "Could not do stuff!", e );
>             }
>             try {
>                 zooKeeperManager.watchChildren( zPath, this );
>             }
>             catch ( InterruptedException e ) {
>                 log.info( "Interrupted adding watch -- shutting down?" );
>                 return;
>             }
>             catch ( KeeperException e ) {
>                 // oh crap, now what?
>             }
>         }
>     }
>
> (In this case, watchChildren is just calling getChildren and passing
> the watcher in.)
>
> It occurs to me I could get more and more complicated here: I could
> wrap watchChildren in a while loop until it succeeds, but that seems
> kind of rude to the caller. Plus what if I get a
> KeeperException.SessionExpiredException or a
> KeeperException.ConnectionLossException? How to handle that in this
> loop? Or I could send some other thread a message that it needs to keep
> trying until the watch has been re-added ... but ... yuck.
>
> I would very much like to just set up this watch once, and have ZooKeeper
> make sure it keeps firing until I tear down ZooKeeper -- this logic
> seems tricky for clients, and quite error prone and full of race conditions.
>
> Any thoughts?
>
> Thanks,
> Eric
>

--
Eric Bowman
Boboco Ltd
ebow...@boboco.ie
http://www.boboco.ie/ebowman/pubkey.pgp
+35318394189/+353872801532
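
One way to avoid looping inside the callback, along the lines of the "send some other thread a message" idea in the quoted mail, is to hand failed re-registrations to a scheduler. The following is a rough sketch under assumptions, not a recommended or official pattern: `WatchRegistrar`, `WatchReAdder` and `SessionExpiredStandIn` are hypothetical names, `register()` stands for whatever re-adds the watch (for example a `getChildren` call that passes the watcher), and the stand-in exception takes the place of `KeeperException.SessionExpiredException` so the code compiles without the ZooKeeper jar.

```java
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

/** Hypothetical: wraps whatever call re-adds the watch, e.g. getChildren with a watcher. */
interface WatchRegistrar {
    void register() throws Exception;
}

/** Stand-in for KeeperException.SessionExpiredException (assumption: keeps the sketch dependency-free). */
class SessionExpiredStandIn extends Exception {
}

/** Hypothetical helper that keeps trying to re-add a watch without blocking the event thread. */
final class WatchReAdder {
    private final ScheduledExecutorService scheduler;
    private final WatchRegistrar registrar;
    private final long retryDelayMillis;
    private final Runnable onSessionExpired;

    WatchReAdder(ScheduledExecutorService scheduler, WatchRegistrar registrar,
                 long retryDelayMillis, Runnable onSessionExpired) {
        this.scheduler = scheduler;
        this.registrar = registrar;
        this.retryDelayMillis = retryDelayMillis;
        this.onSessionExpired = onSessionExpired;
    }

    /** Call from Watcher.process; on failure the next attempt runs later on the scheduler thread. */
    void reAdd() {
        try {
            registrar.register();
        } catch (SessionExpiredStandIn e) {
            // Retrying is pointless once the session is gone; let the owner rebuild the client.
            onSessionExpired.run();
        } catch (InterruptedException e) {
            // Most likely shutting down: restore the interrupt flag and stop retrying.
            Thread.currentThread().interrupt();
        } catch (Exception e) {
            // Connection loss or some other failure: try again after a fixed delay.
            Runnable retry = this::reAdd;
            scheduler.schedule(retry, retryDelayMillis, TimeUnit.MILLISECONDS);
        }
    }
}
```

Here `process` would call `reAdd()` instead of calling `watchChildren` directly. The callback thread never blocks on retries; the cost is that changes made between the failure and the eventual successful re-registration are only noticed if the caller re-reads the children once the watch is finally set.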