On 01.10.2009, at 08:57, Patrick Hunt wrote:
I started looking a bit more closely at the source, some questions:
1) I tried generating the javadocs (see my fork of the project on
github if you want my changes to build.xml for this) but it looks
like there's pretty much no javadoc. Some information, particularly
on semantics of user-exposed operations would be useful (esp re my
earlier README comment - some high level document describing the
benefits, etc... of the library)
If I'm your proto-typical "lazy developer" (which I am :-) ), I'm
really expecting some helpful docs to get me bootstrapped.
2) what purpose does ZkEventThread serve?
ZkClient updates it's connection state from the ZooKeeper events.
Based on these it notifies listeners, updates it's connection state or
reconnects to ZooKeeper. ZkClient has its own event thread to prevent
dead-locks. When a listener blocks (because it waits until ZkClient
has reconnected to Zookeeper), ZkClient wouldn't be able to receive
the reconnect event from ZooKeeper anymore, if we had re-used the
Zookeeper event thread to notifier listeners. See the javadoc for
ZkEventThread for more information.
3) there's definitely an issue in the retryUntilConnected logic that
you need to address
let's say you call zkclient.create, and the connection to the server
is lost while the request is in flight. At this point ConnectionLoss
is thrown on the client side, however you (client) have no
information on whether the server has made the change or not. The
retry method's while loop will re-run the create (after reconnect),
and the result seen by the caller (user code) could be either OK or
may be NODEEXISTS exception, there's no way to know which.
Mahadev is working on ZOOKEEPER-22 which will address this issue,
but that's a future version, not today.
Good catch. I wasn't aware that nodes could still be have been created
when receiving a ConnectionLoss. But how would you deal with that?
If we create a znode and get a ConnectionLoss exception, then wait
until the connection is back and check if the znode is there. There is
no way of knowing whether it was us who created the node or somebody
Anyway. That's definitely a design issue.
4) when I saw that you had separated zkclient and zkconnection I
thought "ah, this is interesting" however when I saw the
implementation I was confused:
a) what purpose does this separation serve?
It's just to have all ZooKeeper communication in one place, where the
higher lever stuff is in ZkClient. That way we are able to provide an
in-memory ZkConnection implementation that doesn't connect to a real
ZooKeeper. This could be used for easier testing.
b) I thought it was to allow multiple zkclients to share a single
connection, however looking at zkclient.close, it closes the
Actually each ZkClient instance maintains one ZooKeeper connection.
5) there's a lot of wrapping of exceptions, looks like this is done
in order to make them unchecked. Is this wise? How much "simpler"
does it really make things? Esp things like interrupted exception?
As you mentioned, one of your intents is to simplify things, but
perhaps too simple? Some short, clear examples of usage would be
helpful here to compare/contrast, I took a very quick look at some
of the tests but that didn't help much. Is there a test(s) in
particular that I should look at to see how zkclient is used, and
the benefits incurred?
Checked exceptions are very painful when you are assembling together a
larger number of libraries (which is true for most enterprise
applications). Either you wind up having a general "throws
Exception" (which I don't really like, because it's too general) at
most of your interfaces, or you have to wrap checked exceptions into
We didn't want a library to introduce yet another checked exception
that you MUST catch or rethrow. I know that there are different
opinions about that, but that's the idea behind this.
Similar situation for the InterruptedException. ZkClient also converts
this to a runtime exception and makes sure that the interrupted flag
doesn't get cleared. There are just too many existing libraries that
have a "catch (Exception e)" somewhere that totally ignores that this
would reset the interrupt flag, if e is an InterruptedException.
Therefore we better avoid having all of the methods throwing that
Thanks a lot for the valuable feedback,
Patrick Hunt wrote:
Hi Stefan, two suggestions off the bat:
1) fill in something in the README, doesn't have to be final or
polished, but give some insight into the what/why/how/where/goals/
etc... to get things moving quickly for reviewers & new users.
2) you should really discuss on the dev list. It's up to you to
include user, but apache discourages use of user for development
discussion (plus you'll pickup more developer insight there)
Stefan Groschupf wrote:
Hi Zookeeper developer,
it would be great if you guys could give us some feedback about
our project zkclient.
The main idea is making the life of lazy developers that only want
minimal zk functionality much easier.
We have a functionality like zkclient mock making testing easy and
fast without running a real zkserver, simple call back interfaces
for the different event types, reconnecting handling in case of
We feel we come closer to a release so it would be great if some
experts could have a look and give us some feedback.
Hadoop training and consulting