[ https://issues.apache.org/jira/browse/ZOOKEEPER-885?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=12920655#action_12920655 ]
Patrick Hunt commented on ZOOKEEPER-885:
----------------------------------------

I notice this in the server log (continuous log):

2010-10-08 08:46:24,789 - DEBUG [ProcessThread:-1:commitproces...@169] - Processing request:: sessionid:0x32b8b03e21e000c type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
2010-10-08 08:46:24,789 - DEBUG [CommitProcessor:3:finalrequestproces...@78] - Processing request:: sessionid:0x32b8b03e21e000c type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
2010-10-08 08:46:24,789 - DEBUG [CommitProcessor:3:finalrequestproces...@160] - sessionid:0x32b8b03e21e000c type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
2010-10-08 08:46:27,986 - DEBUG [ProcessThread:-1:commitproces...@169] - Processing request:: sessionid:0x32b8b03e21e0001 type:ping cxid:0xfffffffffffffffe zxid:0xfffffffffffffffe txntype:unknown reqpath:n/a
2010-10-08 08:46:38,000 - INFO [SessionTracker:zookeeperser...@315] - Expiring session 0x32b8b03e21e0004, timeout of 10000ms exceeded
2010-10-08 08:46:46,471 - INFO [SessionTracker:zookeeperser...@315] - Expiring session 0x32b8b03e21e0003, timeout of 10000ms exceeded
2010-10-08 08:47:00,083 - INFO [SessionTracker:zookeeperser...@315] - Expiring session 0x32b8b03e21e0001, timeout of 10000ms exceeded

It looks to me like the pipeline is getting stalled after 0x32b8b03e21e000c. Notice that 0x32b8b03e21e0001 sends in a ping which gets to the commit processor, but you never see the "final request processor" messages before the session expires. Notice that more than 30 seconds elapse between that ping and the expiry.

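The pattern described in the comment (a request that reaches the commit processor but never shows up in the final request processor before its session expires) can be searched for mechanically in a larger log. The following is only a rough sketch in Python, not part of ZooKeeper: the function name find_stalled_sessions is hypothetical, and it assumes that the pipeline stage can be recognised from the truncated logger names "commitproces" and "finalrequestproces" exactly as they appear in the excerpt.

```python
import re
from datetime import datetime

# Layout of the log lines quoted in the comment:
#   <timestamp> - <LEVEL> [<thread>:<logger>@<line>] - <message>
LINE_RE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2},\d{3}) - "
    r"(?P<level>\w+)\s+\[(?P<source>[^\]]+)\] - (?P<msg>.*)$"
)
SESSION_RE = re.compile(r"sessionid:(0x[0-9a-f]+)")
EXPIRE_RE = re.compile(r"Expiring session (0x[0-9a-f]+)")


def find_stalled_sessions(lines):
    """Return (session id, seconds waited) for every session that expired
    while a request was queued but not yet seen by the final stage.

    Hypothetical helper: stages are identified by substring match on the
    logger name, which is an assumption based on the excerpt only.
    """
    pending = {}  # session id -> time its oldest unfinished request was queued
    stalled = []
    for line in lines:
        match = LINE_RE.match(line.strip())
        if match is None:
            continue
        when = datetime.strptime(match["ts"], "%Y-%m-%d %H:%M:%S,%f")
        source, msg = match["source"], match["msg"]

        expired = EXPIRE_RE.search(msg)
        if expired:
            queued_at = pending.pop(expired.group(1), None)
            if queued_at is not None:
                waited = (when - queued_at).total_seconds()
                stalled.append((expired.group(1), waited))
            continue

        session = SESSION_RE.search(msg)
        if session is None:
            continue
        if "finalrequestproces" in source:
            # The request reached the final stage: no longer pending.
            pending.pop(session.group(1), None)
        elif "commitproces" in source:
            # Queued for the commit processor; remember the first time only.
            pending.setdefault(session.group(1), when)
    return stalled


# Log lines copied from the comment, with the request details shortened.
SAMPLE_LOG = """\
2010-10-08 08:46:24,789 - DEBUG [ProcessThread:-1:commitproces...@169] - Processing request:: sessionid:0x32b8b03e21e000c type:ping
2010-10-08 08:46:24,789 - DEBUG [CommitProcessor:3:finalrequestproces...@78] - Processing request:: sessionid:0x32b8b03e21e000c type:ping
2010-10-08 08:46:24,789 - DEBUG [CommitProcessor:3:finalrequestproces...@160] - sessionid:0x32b8b03e21e000c type:ping
2010-10-08 08:46:27,986 - DEBUG [ProcessThread:-1:commitproces...@169] - Processing request:: sessionid:0x32b8b03e21e0001 type:ping
2010-10-08 08:46:38,000 - INFO [SessionTracker:zookeeperser...@315] - Expiring session 0x32b8b03e21e0004, timeout of 10000ms exceeded
2010-10-08 08:46:46,471 - INFO [SessionTracker:zookeeperser...@315] - Expiring session 0x32b8b03e21e0003, timeout of 10000ms exceeded
2010-10-08 08:47:00,083 - INFO [SessionTracker:zookeeperser...@315] - Expiring session 0x32b8b03e21e0001, timeout of 10000ms exceeded
""".splitlines()

if __name__ == "__main__":
    for session_id, waited in find_stalled_sessions(SAMPLE_LOG):
        print(f"{session_id} expired {waited:.3f}s after its request was queued")
```

On the excerpt this should flag only session 0x32b8b03e21e0001, whose ping was queued at 08:46:27,986 and whose expiry was logged at 08:47:00,083. Sessions 0x32b8b03e21e0004 and 0x32b8b03e21e0003 are not flagged because the excerpt shows no queued request for them.
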
> Zookeeper drops connections under moderate IO load
> --------------------------------------------------
>
>                 Key: ZOOKEEPER-885
>                 URL: https://issues.apache.org/jira/browse/ZOOKEEPER-885
>             Project: Zookeeper
>          Issue Type: Bug
>          Components: server
>    Affects Versions: 3.2.2, 3.3.1
>         Environment: Debian (Lenny)
> 1Gb RAM
> swap disabled
> 100Mb heap for zookeeper
>            Reporter: Alexandre Hardy
>            Priority: Critical
>         Attachments: tracezklogs.tar.gz, tracezklogs.tar.gz, WatcherTest.java, zklogs.tar.gz
>
>
> A zookeeper server under minimum load, with a number of clients watching
> exactly one node, will fail to maintain the connection when the machine is
> subjected to moderate IO load.
> In a specific test example we had three zookeeper servers running on
> dedicated machines with 45 clients connected, watching exactly one node. The
> clients would disconnect after moderate load was added to each of the
> zookeeper servers with the command:
> {noformat}
> dd if=/dev/urandom of=/dev/mapper/nimbula-test
> {noformat}
> The {{dd}} command transferred data at a rate of about 4Mb/s.
> The same thing happens with
> {noformat}
> dd if=/dev/zero of=/dev/mapper/nimbula-test
> {noformat}
> It seems strange that such a moderate load should cause instability in the
> connection.
> Very few other processes were running; the machines were set up to test the
> connection instability we have experienced. Clients performed no other read
> or mutation operations.
> Although the documents state that minimal competing IO load should be present on
> the zookeeper server, it seems reasonable that moderate IO should not cause
> problems in this case.

-- 
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.