Alexandre Hardy commented on ZOOKEEPER-885:

Hi Alexandre, When you load the machines running the zookeeper servers by 
running the dd command, how much time elapses between running dd and observing 
the connections expiring? I have not been able to reproduce it, and I wonder how 
long the problem takes to manifest.

Hi Flavio,

Problems usually start occurring after about 30 seconds. I have also tested on 
some other machines, and the delay varies somewhat. I suspect that the 30 seconds 
is dictated by some queue that needs to fill up before significant disk traffic 
is initiated. I think that the speed at which this queue is processed 
determines how likely it is that zookeeper will fail to respond to a ping.

Please also see responses to Patrick's questions.
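
One way to check the queue hypothesis above is to sample fsync latency on the server's disk while dd runs and see whether it jumps at around the 30 second mark. The following is only a minimal sketch: the function name probe_fsync_latency and its parameters are hypothetical, and treating fsync latency as a proxy for missed pings is an assumption, not something established in this report.

```python
import os
import tempfile
import time


def probe_fsync_latency(directory, samples=5, interval=0.0, block_size=4096):
    """Yield the time in seconds taken by each fsync of a small appended write.

    Hypothetical helper: 'directory' should be on the disk under test.
    """
    payload = b"\0" * block_size
    fd, path = tempfile.mkstemp(dir=directory)
    try:
        for _ in range(samples):
            os.write(fd, payload)
            start = time.monotonic()
            os.fsync(fd)  # blocks until the data has been handed to the device
            yield time.monotonic() - start
            time.sleep(interval)
    finally:
        # Runs when the generator is exhausted or closed: remove the scratch file.
        os.close(fd)
        os.unlink(path)
```

Calling it with, for example, samples=120 and interval=1.0 on a directory of the disk under test, and printing each value as it is yielded, gives one reading per second for two minutes. A sharp rise in latency shortly before the clients disconnect would be consistent with the queue explanation; a flat trace would argue against it.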

> Zookeeper drops connections under moderate IO load
> --------------------------------------------------
>                 Key: ZOOKEEPER-885
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-885
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.2.2, 3.3.1
>         Environment: Debian (Lenny)
> 1 GB RAM
> swap disabled
> 100 MB heap for zookeeper
>            Reporter: Alexandre Hardy
>            Priority: Critical
>         Attachments: tracezklogs.tar.gz, tracezklogs.tar.gz, 
> WatcherTest.java, zklogs.tar.gz
> A zookeeper server under minimal load, with a number of clients watching 
> exactly one node, will fail to maintain the connection when the machine is 
> subjected to moderate IO load.
> In a specific test example we had three zookeeper servers running on 
> dedicated machines with 45 clients connected, watching exactly one node. The 
> clients would disconnect after moderate load was added to each of the 
> zookeeper servers with the command:
> {noformat}
> dd if=/dev/urandom of=/dev/mapper/nimbula-test
> {noformat}
> The {{dd}} command transferred data at a rate of about 4 MB/s.
> The same thing happens with
> {noformat}
> dd if=/dev/zero of=/dev/mapper/nimbula-test
> {noformat}
> It seems strange that such a moderate load should cause instability in the 
> connection.
> Very few other processes were running; the machines were set up to test the 
> connection instability we have experienced. Clients performed no other read 
> or mutation operations.
> Although the documentation states that minimal competing IO load should be 
> present on the zookeeper server, it seems reasonable that moderate IO should 
> not cause problems in this case.

This message is automatically generated by JIRA.
You can reply to this email to add a comment to the issue online.
