Hi Ted,
> These problems seem to manifest around getting lots of anomalous disconnects
> and session expirations even though we have the timeout values set to 2
> seconds on the server side and 5 seconds on the client side.
> 

Your scenario might be a little different from what Nitay (HBase) is
seeing. In their scenario the ZooKeeper client was not able to send out
pings to the server because GC was stalling threads in their ZooKeeper
application process.

The latencies seen by ZooKeeper clients are directly related to the
ZooKeeper server machines. They depend heavily on the disk I/O latencies
you get on the ZooKeeper servers and on the network latencies within your
cluster.

I am not sure how sensitive you want your ZooKeeper application to be
-- but increasing the timeout should help. Also, we recommend using a
dedicated disk for the ZooKeeper transaction log.

http://hadoop.apache.org/zookeeper/docs/r3.1.1/zookeeperAdmin.html#sc_strengthsAndLimitations
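
On the timeout itself: as far as I understand the programmer's guide for
this release line, the server bounds the session timeout a client asks for
to between 2 and 20 times the server's tickTime. The sketch below only
illustrates that arithmetic. The function name is made up for the example,
it is not ZooKeeper code, and the tickTime of 2000 ms in the usage note is
an assumption about your setup.

```python
def negotiated_timeout_ms(requested_ms, tick_time_ms):
    """Illustrative only: clamp a requested session timeout to the
    range [2 * tickTime, 20 * tickTime]. Hypothetical helper name."""
    lower = 2 * tick_time_ms    # smallest timeout the server would grant
    upper = 20 * tick_time_ms   # largest timeout the server would grant
    return max(lower, min(requested_ms, upper))
```

If tickTime were 2000 ms, a client asking for 5000 ms would stay at 5000 ms,
while a request for 1000 ms would be raised to 4000 ms. So raising the
client-side value is what actually buys more headroom in that case.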

Also, we have seen NTP having problems and clocks going backwards on one
of our VM setups. This would lead to sessions timing out earlier than the
configured session timeout.
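
To show why clock steps matter for any timeout bookkeeping, here is a toy
expiry tracker. It is not how ZooKeeper tracks sessions. The class and
method names are invented for the example. The clock is injectable so the
effect of a stepped clock can be tried out by hand.

```python
import time


class SessionTracker:
    """Toy session-expiry tracker (hypothetical, not ZooKeeper's code)."""

    def __init__(self, timeout_s, clock=time.monotonic):
        self.timeout_s = timeout_s
        self.clock = clock              # callable returning seconds
        self.last_heard = clock()       # time of the last heartbeat

    def touch(self):
        """Record a heartbeat from the client."""
        self.last_heard = self.clock()

    def expired(self):
        """True once more than timeout_s has elapsed since the last heartbeat."""
        return self.clock() - self.last_heard > self.timeout_s
```

With the default time.monotonic the elapsed time is not affected when the
system clock is adjusted. Passing time.time instead makes the result depend
on the wall clock, so a clock step on a VM changes when the session appears
to expire.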

I hope this helps.


mahadev

On 4/14/09 5:48 PM, "Ted Dunning" <ted.dunn...@gmail.com> wrote:

> We have been using EC2 as a substrate for our search cluster with zookeeper
> as our coordination layer and have been seeing some strange problems.
> 
> These problems seem to manifest around getting lots of anomalous disconnects
> and session expirations even though we have the timeout values set to 2
> seconds on the server side and 5 seconds on the client side.
> 
> Has anybody else been seeing this?
> 
> Is this related to clock jumps in a virtualized setting?
> 
> On a related note, what is best practice for handling session expiration?
> Just deal with it as if it is a new start?
