Btw, are you monitoring the ZK server JVMs? Please take a look at

http://hadoop.apache.org/zookeeper/docs/current/zookeeperAdmin.html#sc_zkCommands

It would be interesting if you could run commands such as "stat" against your currently running cluster. In particular I'd be interested to know what you see for latency and "node count" in the stat response. Run this command against each of the servers in your ensemble. For example, a high "max latency" might indicate that something (usually swap/GC) is causing the server to respond slowly in some cases.
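For reference, a minimal sketch of how to do this from a shell: the "stat" four-letter command is just sent over the client port, e.g. with nc. The hostnames in the usage comment are placeholders for your own ensemble.

```shell
# Keep only the latency and node-count lines from a "stat" response.
parse_stat() {
  grep -E '^(Latency|Node count)'
}

# Live usage (requires nc and a running server on each client port;
# zk1/zk2/zk3 are hypothetical hostnames):
#   for h in zk1 zk2 zk3; do
#     echo stat | nc "$h" 2181 | parse_stat
#   done
```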

Patrick

On 04/28/2010 10:47 PM, Patrick Hunt wrote:
Hi Aaron, some questions/comments below:

On 04/28/2010 06:29 PM, Aaron Crow wrote:
We were running version 3.2.2 for about a month and it was working well for us. Then late this past Saturday night, our cluster went pathological. One of the 3 ZK servers spewed many WARNs (see below), and the other 2 servers were almost constantly pegging the CPU. All three servers are on separate machines. From what we could tell, the machines were fine... networking fine, disk fine, etc. The ZK clients were completely unable to complete their connections to ZK.


These machines are connected over a local network (not a WAN), then? What OS and Java version are you using?

Do you see any FATAL or ERROR level messages in the logs?

It would help to look at your ZK config files for these servers. Could you provide them? (You might want to create a JIRA first, then just attach the configs and other details/collateral to that; it's easier than dealing with email.)

If you have logs for the time period and can share them, that would be most useful. (Again, gzip and attach them to the JIRA.)

We tried all sorts of restarts, running zkCleanup, etc. We even completely shut down our clients... and the pathology continued. Our workaround was to do an urgent upgrade to version 3.3.0. The new ZK cluster with 3.3.0 has been running well for us... so far...


Off hand, and with the data we have so far, nothing sticks out that 3.3 would have resolved (JIRA has conveniently been down for the last hour or so, so I can't review right now). Although there were some changes to reduce memory consumption (see below).

I realize that, sadly, this message doesn't contain nearly enough details to trace exactly what happened. I guess I'm wondering if anyone has seen this general scenario, and/or knows how to prevent it? Is there anything we might be doing client side to trigger this? Our application-level request frequency is maybe a few requests to ZooKeeper per second, times 5 client applications. If we detect a SESSION EXPIRED, we do a simple "create a new client and use that instead". And we were seeing this happen occasionally.


What are the clients doing? Do you have a large number of znodes, or large znodes?

Do you see any OutOfMemoryError in the logs?

Could the ZK server Java process be swapping? Are you monitoring GC? Perhaps large GC pauses are happening.
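Two quick checks along these lines, as a sketch (this assumes a Linux box, the field layout of `free -m`, and that you can find the ZK server's pid; none of that is specific to the original report):

```shell
# Report swap in use, in MB, from the output of `free -m` (Linux layout:
# the "Swap:" row is total/used/free, so the used figure is field 3).
swap_used_mb() {
  awk '/^Swap:/ { print $3 }'
}

# Live usage:
#   free -m | swap_used_mb             # anything well above 0 is suspect
#   jstat -gcutil <zk-server-pid> 5s   # watch FGC/FGCT for long full GCs
```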

I have a suspicion that one of a few things might be happening. I see
the following in your original email:

 > :followerhand...@302] - Sending snapshot last zxid of peer is
0xd0007d66d
 > zxid of leader is 0xf00000000
 > 2010-04-24 23:06:03,254 - ERROR [FollowerHandler-/10.0.10.116:34405
 > :followerhand...@415] - Unexpected exception causing shutdown while sock
 > still open
 > java.net.SocketException: Broken pipe
 > at java.net.SocketOutputStream.socketWrite0(Native Method)
 > at

^^^^ this log section is saying that a ZK server (follower) connected to the elected leader while recovering and started to download the latest ZK snapshot (db) — the leader is sending it here. However, before the follower fully downloads the snapshot, the connection (pipe) is broken and the operation fails. At this point the leader itself will probably give up the lead, a new election will happen, and it may be that this same problem occurs - the follower tries to download the snapshot and it fails again (and this loops forever)

This issue can happen if the servers are slowed, whether by slow interconnects (WAN), network connectivity/performance problems, swapping of the JVM, or, most commonly, GC in the VM.
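If the snapshot transfer keeps timing out before it completes, one common mitigation (besides fixing the underlying swap/GC problem) is to give the ensemble more time to sync. A sketch of the relevant zoo.cfg settings — the values below are illustrative, not recommendations for your cluster:

```
# zoo.cfg fragment -- illustrative values only
tickTime=2000     # ms per tick
initLimit=10      # ticks a follower may take to connect to and sync with the leader
syncLimit=5       # ticks a follower may lag behind the leader before being dropped
```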

See this recent case:
http://www.mail-archive.com/zookeeper-...@hadoop.apache.org/msg08083.html

What is the size of your snapshot file? (You can see this in your ZooKeeper data directory, unless you wiped it during your attempt to cleanup/recover/upgrade?)
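A sketch of checking this from the shell. The data directory path in the usage comment is a placeholder; in the 3.x layout the snapshots live under a version-2 subdirectory and are named snapshot.&lt;zxid&gt;:

```shell
# List snapshot files (size in bytes, then path) in a ZK data directory.
snapshot_sizes() {
  # $1 = data directory; snapshot files are named snapshot.<zxid>
  ls -l "$1"/snapshot.* 2>/dev/null | awk '{ print $5, $NF }'
}

# Live usage (path is hypothetical):
#   snapshot_sizes /var/zookeeper/version-2
```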

Seeing your config file, and getting a sense of the snapshot file size
will give us more insight.

Patrick

Many, many thanks in advance for any insights or advice.


Example log output from the ZK server that spewed many WARNs:

===

2010-04-24 22:31:24,551 - WARN
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:follo...@318] - Exception when
following
the leader
java.net.SocketTimeoutException: Read timed out
at java.net.SocketInputStream.socketRead0(Native Method)
at java.net.SocketInputStream.read(SocketInputStream.java:129)
at java.io.BufferedInputStream.fill(BufferedInputStream.java:218)
at java.io.BufferedInputStream.read(BufferedInputStream.java:237)
at java.io.DataInputStream.readInt(DataInputStream.java:370)
at org.apache.jute.BinaryInputArchive.readInt(BinaryInputArchive.java:63)
at
org.apache.zookeeper.server.quorum.QuorumPacket.deserialize(QuorumPacket.java:66)

at
org.apache.jute.BinaryInputArchive.readRecord(BinaryInputArchive.java:108)

at
org.apache.zookeeper.server.quorum.Follower.readPacket(Follower.java:114)
at
org.apache.zookeeper.server.quorum.Follower.followLeader(Follower.java:193)

at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:525)
2010-04-24 22:31:24,552 - INFO
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:follo...@436] - shutdown called
java.lang.Exception: shutdown Follower
at
org.apache.zookeeper.server.quorum.Follower.shutdown(Follower.java:436)
at org.apache.zookeeper.server.quorum.QuorumPeer.run(QuorumPeer.java:529)
2010-04-24 22:31:24,552 - INFO
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@514] - LOOKING

...

2010-04-24 22:31:27,257 - WARN
[NIOServerCxn.Factory:2181:nioserverc...@518]
- Exception causing close of session 0x0 due to java.io.IOException:
ZooKeeperServer not running
2010-04-24 22:31:27,257 - INFO
[NIOServerCxn.Factory:2181:nioserverc...@857]
- closing session:0x0 NIOServerCnxn:
java.nio.channels.SocketChannel[connected local=/127.0.0.1:2181 remote=/127.0.0.1:44589]
2010-04-24 22:31:27,705 - INFO
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:fastleaderelect...@706] - Notification:
3, 60129542145, 2, 1, LOOKING, LEADING, 3
2010-04-24 22:31:27,705 - INFO
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:quorump...@523] - FOLLOWING
2010-04-24 22:31:27,705 - INFO
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:zookeeperser...@160] - Created server
2010-04-24 22:31:27,706 - INFO
[QuorumPeer:/0:0:0:0:0:0:0:0:2181:follo...@147] - Following sum08/
10.0.10.118:2888
2010-04-24 22:31:29,040 - WARN
[NIOServerCxn.Factory:2181:nioserverc...@518]
- Exception causing close of session 0x0 due to java.io.IOException:
ZooKeeperServer not running
2010-04-24 22:31:29,040 - INFO
[NIOServerCxn.Factory:2181:nioserverc...@857]
- closing session:0x0 NIOServerCnxn:
java.nio.channels.SocketChannel[connected local=/127.0.0.1:2181 remote=/127.0.0.1:44591]
2010-04-24 22:31:31,069 - WARN
[NIOServerCxn.Factory:2181:nioserverc...@518]
- Exception causing close of session 0x0 due to java.io.IOException:
ZooKeeperServer not running
2010-04-24 22:31:31,069 - INFO
[NIOServerCxn.Factory:2181:nioserverc...@857]
- closing session:0x0 NIOServerCnxn:
java.nio.channels.SocketChannel[connected local=/127.0.0.1:2181 remote=/127.0.0.1:44592]
2010-04-24 22:31:32,589 - WARN
[NIOServerCxn.Factory:2181:nioserverc...@518]
- Exception causing close of session 0x0 due to java.io.IOException:
ZooKeeperServer not running

===

Example log output from a server that was pegging CPU:

2010-04-24 23:05:27,911 - INFO [WorkerReceiver
Thread:fastleaderelection$messenger$workerrecei...@254] - Sending new
notification.
2010-04-24 23:05:27,912 - INFO [WorkerReceiver
Thread:fastleaderelection$messenger$workerrecei...@254] - Sending new
notification.
2010-04-24 23:05:27,910 - WARN
[Thread-15:quorumcnxmanager$sendwor...@539]
- Interrupted while waiting for message on queue
java.lang.InterruptedException
at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.reportInterruptAfterWait(AbstractQueuedSynchronizer.java:1899)

at
java.util.concurrent.locks.AbstractQueuedSynchronizer$ConditionObject.await(AbstractQueuedSynchronizer.java:1934)

at
java.util.concurrent.ArrayBlockingQueue.take(ArrayBlockingQueue.java:317)
at
org.apache.zookeeper.server.quorum.QuorumCnxManager$SendWorker.run(QuorumCnxManager.java:533)

2010-04-24 23:05:27,912 - INFO [WorkerReceiver
Thread:fastleaderelection$messenger$workerrecei...@254] - Sending new
notification.
2010-04-24 23:05:27,912 - WARN
[Thread-16:quorumcnxmanager$recvwor...@623]
- Connection broken:
java.nio.channels.AsynchronousCloseException
at
java.nio.channels.spi.AbstractInterruptibleChannel.end(AbstractInterruptibleChannel.java:185)

at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:263)
at
org.apache.zookeeper.server.quorum.QuorumCnxManager$RecvWorker.run(QuorumCnxManager.java:594)

2010-04-24 23:05:27,912 - WARN
[Thread-15:quorumcnxmanager$sendwor...@554]
- Send worker leaving thread

...

2010-04-24 23:05:51,952 - INFO [WorkerReceiver
Thread:fastleaderelection$messenger$workerrecei...@254] - Sending new
notification.
2010-04-24 23:05:57,148 - INFO [WorkerReceiver
Thread:fastleaderelection$messenger$workerrecei...@254] - Sending new
notification.
2010-04-24 23:06:03,243 - INFO [FollowerHandler-/10.0.10.116:34405
:followerhand...@227] - Follower sid: 1 : info :
org.apache.zookeeper.server.quorum.quorumpeer$quorumser...@364641
2010-04-24 23:06:03,244 - WARN [FollowerHandler-/10.0.10.116:34405
:followerhand...@302] - Sending snapshot last zxid of peer is 0xd0007d66d
zxid of leader is 0xf00000000
2010-04-24 23:06:03,254 - ERROR [FollowerHandler-/10.0.10.116:34405
:followerhand...@415] - Unexpected exception causing shutdown while sock
still open
java.net.SocketException: Broken pipe
at java.net.SocketOutputStream.socketWrite0(Native Method)
at
java.net.SocketOutputStream.socketWrite(SocketOutputStream.java:92)
at java.net.SocketOutputStream.write(SocketOutputStream.java:136)
at
java.io.BufferedOutputStream.flushBuffer(BufferedOutputStream.java:65)
at java.io.BufferedOutputStream.write(BufferedOutputStream.java:109)
at java.io.DataOutputStream.writeLong(DataOutputStream.java:207)
at
org.apache.jute.BinaryOutputArchive.writeLong(BinaryOutputArchive.java:59)

at org.apache.zookeeper.server.DataNode.serialize(DataNode.java:129)
at
org.apache.jute.BinaryOutputArchive.writeRecord(BinaryOutputArchive.java:123)

at
org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:917)
at
org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:929)
at
org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:929)
at
org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:929)
at
org.apache.zookeeper.server.DataTree.serializeNode(DataTree.java:929)
at org.apache.zookeeper.server.DataTree.serialize(DataTree.java:979)
at
org.apache.zookeeper.server.util.SerializeUtils.serializeSnapshot(SerializeUtils.java:104)

at
org.apache.zookeeper.server.ZooKeeperServer.serializeSnapshot(ZooKeeperServer.java:272)

at
org.apache.zookeeper.server.quorum.FollowerHandler.run(FollowerHandler.java:307)

2010-04-24 23:06:03,255 - WARN [FollowerHandler-/10.0.10.116:34405
:followerhand...@421] - ******* GOODBYE /10.0.10.116:34405 ********
2010-04-24 23:06:03,550 - INFO [WorkerReceiver
Thread:fastleaderelection$messenger$workerrecei...@254] - Sending new
notification.
