Hi Mahadev,

Thanks for your reply. Yes, I did check the example implementation. However, it is not suitable for my purpose as it stands, because it needs a separate ZooKeeper instance for each WriteLock. I want to support multiple threads through one ZooKeeper instance (I don't want to pay the connect overhead every time just to get a lock). Is this approach bad?

I'm not sure, but there seems to be a possibility of starvation in the example code (locks are ordered based on session ID rather than znode id). I'll open a ticket for this.

Thanks,
-Jaakko

On Fri, Jan 15, 2010 at 2:36 AM, Mahadev Konar <maha...@yahoo-inc.com> wrote:
> Hi Jaakko,
> The lock recipe has already been implemented in zookeeper under
> src/recipes/lock (version 3.* I think). It has code to deal with
> connectionloss as well. I would suggest that you use the recipe. You can
> file jira's in case you see some shortcomings/bugs in the code.
>
>
> Thanks
> mahadev
>
>
> On 1/14/10 1:32 AM, "Jaakko" <rosvopaalli...@gmail.com> wrote:
>
>> Hi,
>>
>> I'm trying to provide mutex services through a singleton class
>> (methods lock and unlock). It basically follows lock recipe, but I'm
>> having a problem how to handle connection loss properly if it happens
>> during mutex wait:
>>
>> pseudocode/snippet:
>>
>>
>> public class SingletonMutex implements Watcher
>> {
>>     private Integer mutex;
>>     private ZooKeeper zk;
>>
>>     public void process(WatchedEvent event)
>>     {
>>         synchronized (mutex)
>>         {
>>             mutex.notifyAll();
>>         }
>>     }
>>
>>     private String lock()
>>     {
>>         <create ephemeral znode>
>>         <find children and do related checks>
>>
>>         if (there_is_somebody_with_smaller_number)
>>         {
>>             mutex.wait();
>>
>>             if (isConnected() == false)
>>                 throw new Exception("foobar");
>>         }
>>     }
>> }
>>
>> Question is: If there is a server disconnect during mutex.wait, the
>> thread will wake up without having any means to continue (or delete
>> the znode), so it throws an exception. However, if this is only due to
>> connection loss (and not session expire), the lock znode it has
>> created previously will not be deleted, thus resulting in a deadlock.
>> One way would be to use sessionId in lock name and check if we already
>> have the lock when entering lock method. However, since this is
>> singleton class and used by multiple threads, that approach won't
>> work. Using thread ID for this purpose or some form of internal
>> bookkeeping is also not very robust. Currently I just do zk.close and
>> get another instance on connection loss, which seems to solve the
>> problem.
>>
>> Is there any other way to do this?
>>
>> Thanks for any comments/suggestions,
>>
>> -Jaakko
>
>
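
To illustrate the ordering concern raised at the top of this message, here is a minimal Java sketch. It assumes lock znodes are named x-<sessionId>-<sequence>; that name format, the class name LockOrdering, and the helper methods are made up for illustration and are not taken from the recipe code. It only shows how sorting by the full name (session ID first) can pick a different lock holder than sorting by the sequence suffix.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class LockOrdering {
    // Assumption: child names look like "x-<sessionId>-<sequence>",
    // where <sequence> is the zero-padded counter of a sequential znode.
    public static long sequenceOf(String znodeName) {
        int dash = znodeName.lastIndexOf('-');
        return Long.parseLong(znodeName.substring(dash + 1));
    }

    // Orders candidates by the sequence suffix only, so the oldest
    // request comes first regardless of which session created it.
    public static List<String> sortBySequence(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.comparingLong(LockOrdering::sequenceOf));
        return sorted;
    }

    // Orders candidates by the whole name, which effectively sorts by
    // session ID first. A session with a small ID can then keep jumping
    // ahead of an older request from a session with a larger ID.
    public static List<String> sortByName(List<String> children) {
        List<String> sorted = new ArrayList<>(children);
        sorted.sort(Comparator.naturalOrder());
        return sorted;
    }

    public static void main(String[] args) {
        // Hypothetical children of a lock directory: session 200 asked first.
        List<String> children = Arrays.asList(
                "x-200-0000000001",
                "x-100-0000000002",
                "x-100-0000000003");
        System.out.println("first by name:     " + sortByName(children).get(0));
        System.out.println("first by sequence: " + sortBySequence(children).get(0));
    }
}
```

In this made-up example, name ordering puts a request from session 100 ahead of the older request from session 200, while sequence ordering serves them in arrival order.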