[ 
https://issues.apache.org/jira/browse/ZOOKEEPER-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12921226#action_12921226
 ] 

Patrick Hunt edited comment on ZOOKEEPER-885 at 10/15/10 1:17 AM:
------------------------------------------------------------------

Alexandre, can you provide some additional detail?

1) you are applying the load using dd to all three servers at the same time, is 
that correct? (not just to 1 server)

2) /dev/mapper indicates some sort of LVM setup; can you give more detail on 
that? (FYI: http://ubuntuforums.org/showthread.php?t=646340)

3) you mentioned that this:

{quote}
echo 5 > /proc/sys/vm/dirty_ratio
echo 5 > /proc/sys/vm/dirty_background_ratio
{quote}
 
resulted in "stability in this test". Can you tell us what these values were 
set to initially?
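
For reference, a minimal sketch of how the current values could be read back before changing them. The function name show_dirty_settings and its optional directory argument are hypothetical, added only for illustration; the two file names are the ones from the quoted commands.

```shell
# Print the current writeback thresholds so they can be compared with the
# values set in the test. The optional directory argument is an assumption
# made for illustration (it lets the function be tried against a copy of
# the files); by default it reads the real location on Linux.
show_dirty_settings() {
    vm_dir="${1:-/proc/sys/vm}"
    for name in dirty_ratio dirty_background_ratio; do
        if [ -r "$vm_dir/$name" ]; then
            printf '%s = %s\n' "$name" "$(cat "$vm_dir/$name")"
        else
            printf '%s = (not readable)\n' "$name"
        fi
    done
}

show_dirty_settings
```

Running this on each server before the echo commands are applied would record the initial settings.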

Check out this article: http://lwn.net/Articles/216853/

I notice you are running a "bigmem" kernel. What's the total memory size? How 
large a heap have you assigned to the ZK server JVM?

4) Can you verify whether or not the JVM is swapping? Is there any chance that 
the server JVM is swapping, causing the server to pause, which then causes the 
clients to time out? This seems to me like it would fit the scenario, 
especially given that when you turn the "dirty_ratio" down you see stability 
increase (the time it would take to complete the flush would decrease, meaning 
that the server can respond before the client times out).
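
One possible way to check for swapping, sketched under the assumption that the kernel exposes the pswpin and pswpout counters in /proc/vmstat. The function name swap_counters and its optional file argument are hypothetical. The issue's Environment field lists swap as disabled, in which case the counters would simply stay unchanged.

```shell
# Print the cumulative swap-in / swap-out page counters. The optional file
# argument is an assumption for illustration; by default the function reads
# /proc/vmstat on Linux.
swap_counters() {
    vmstat_file="${1:-/proc/vmstat}"
    awk '$1 == "pswpin" || $1 == "pswpout" { print $1, $2 }' "$vmstat_file"
}

# Only read the live counters where the file is actually available.
if [ -r /proc/vmstat ]; then
    swap_counters
fi
```

Taking one reading before starting the dd load and another after the clients disconnect would show whether any pages were swapped in or out during the test: if both numbers are unchanged, the machine did not swap in that interval.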





  
> Zookeeper drops connections under moderate IO load
> --------------------------------------------------
>
>                 Key: ZOOKEEPER-885
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-885
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.2.2, 3.3.1
>         Environment: Debian (Lenny)
> 1Gb RAM
> swap disabled
> 100Mb heap for zookeeper
>            Reporter: Alexandre Hardy
>            Priority: Critical
>         Attachments: tracezklogs.tar.gz, tracezklogs.tar.gz, 
> WatcherTest.java, zklogs.tar.gz
>
>
> A zookeeper server under minimum load, with a number of clients watching 
> exactly one node will fail to maintain the connection when the machine is 
> subjected to moderate IO load.
> In a specific test example we had three zookeeper servers running on 
> dedicated machines with 45 clients connected, watching exactly one node. The 
> clients would disconnect after moderate load was added to each of the 
> zookeeper servers with the command:
> {noformat}
> dd if=/dev/urandom of=/dev/mapper/nimbula-test
> {noformat}
> The {{dd}} command transferred data at a rate of about 4Mb/s.
> The same thing happens with
> {noformat}
> dd if=/dev/zero of=/dev/mapper/nimbula-test
> {noformat}
> It seems strange that such a moderate load should cause instability in the 
> connection.
> Very few other processes were running; the machines were set up to test the 
> connection instability we have experienced. Clients performed no other read 
> or mutation operations.
> Although the documents state that minimal competing IO load should be present 
> on the zookeeper server, it seems reasonable that moderate IO should not cause 
> problems in this case.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.
