[ https://issues.apache.org/jira/browse/YARN-3795?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581553#comment-14581553 ]
zhihai xu commented on YARN-3795:
---------------------------------
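Hi [~lachisis], thanks for reporting this issue.
The Broken pipe is most likely caused by a Len error on the ZooKeeper server:
when a request exceeds the server's maximum buffer size, the server closes the
connection, and the client's next write fails with Broken pipe.
To confirm this, could you check the ZooKeeper server logs for a message like
the following: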
{code}
WARN org.apache.zookeeper.server.NIOServerCnxn: Exception causing close of
session 0x???????? due to java.io.IOException: Len error ???????
{code}
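If the server logs are large, a small script can pull out the relevant
entries. The following is only a sketch in Python: the function name
find_len_errors is made up for illustration, and the pattern assumes the
message appears on a single line in the actual log, in the form quoted above.

```python
import re

# Pattern for the WARN message quoted above. The session id and the
# reported length differ per event, so both are captured.
LEN_ERROR = re.compile(
    r"Exception causing close of session (0x[0-9a-fA-F]+) "
    r"due to java\.io\.IOException: Len error (-?\d+)"
)


def find_len_errors(lines):
    """Return (session_id, length) pairs for each Len error line found."""
    hits = []
    for line in lines:
        match = LEN_ERROR.search(line)
        if match:
            hits.append((match.group(1), int(match.group(2))))
    return hits
```

It accepts any iterable of lines, so an open log file can be passed in
directly. The lengths it reports show how large the rejected requests were.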
You can work around the Len error by increasing jute.maxbuffer on the
ZooKeeper server, or you can try YARN-3469.
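As a rough illustration of sizing the new value, the Python sketch below
derives a jute.maxbuffer setting from the length reported in a Len error
message. The helper names and the headroom factor of 2 are assumptions made
for this example, not anything ZooKeeper prescribes. The default of 0xfffff
bytes is the one documented for ZooKeeper. jute.maxbuffer is a Java system
property, so the result is formatted as a -D flag; how that flag reaches the
server JVM depends on your deployment.

```python
# ZooKeeper's documented default for jute.maxbuffer (just under 1 MB).
DEFAULT_JUTE_MAXBUFFER = 0xFFFFF


def suggest_jute_maxbuffer(observed_len, current=DEFAULT_JUTE_MAXBUFFER,
                           headroom=2.0):
    """Return a buffer size in bytes that would admit observed_len.

    headroom is an arbitrary safety factor chosen for this sketch.
    The result is never smaller than the current limit.
    """
    return max(current, int(observed_len * headroom))


def jvm_flag(value):
    """Format the size as a Java system property flag."""
    return f"-Djute.maxbuffer={value}"
```

For example, jvm_flag(suggest_jute_maxbuffer(1500000)) produces a flag sized
for a 1500000-byte request. The ZooKeeper documentation advises keeping this
property consistent across servers and clients.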
> ZKRMStateStore crashes due to IOException: Broken pipe
> ------------------------------------------------------
>
> Key: YARN-3795
> URL: https://issues.apache.org/jira/browse/YARN-3795
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager
> Affects Versions: 2.5.0
> Reporter: lachisis
> Priority: Critical
>
> 2015-06-05 06:06:54,848 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to dap88/134.41.33.88:2181, initiating session
> 2015-06-05 06:06:54,876 INFO org.apache.zookeeper.ClientCnxn: Session
> establishment complete on server dap88/134.41.33.88:2181, sessionid =
> 0x34db2f72ac50c86, negotiated timeout = 10000
> 2015-06-05 06:06:54,881 INFO
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore:
> Watcher event type: None with state:SyncConnected for path:null for Service
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
> 2015-06-05 06:06:54,881 INFO
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore:
> ZKRMStateStore Session connected
> 2015-06-05 06:06:54,881 INFO
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore:
> ZKRMStateStore Session restored
> 2015-06-05 06:06:54,881 WARN org.apache.zookeeper.ClientCnxn: Session
> 0x34db2f72ac50c86 for server dap88/134.41.33.88:2181, unexpected error,
> closing socket connection and attempting reconnect
> java.io.IOException: Broken pipe
> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
> at sun.nio.ch.IOUtil.write(IOUtil.java:65)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
> at
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
> at
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)
> 2015-06-05 06:06:54,986 INFO
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore:
> Watcher event type: None with state:Disconnected for path:null for Service
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
> 2015-06-05 06:06:54,986 INFO
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore:
> ZKRMStateStore Session disconnected
> 2015-06-05 06:06:55,278 INFO org.apache.zookeeper.ClientCnxn: Opening socket
> connection to server dap87/134.41.33.87:2181. Will not attempt to
> authenticate using SASL (unknown error)
> 2015-06-05 06:06:55,278 INFO org.apache.zookeeper.ClientCnxn: Socket
> connection established to dap87/134.41.33.87:2181, initiating session
> 2015-06-05 06:06:55,330 INFO org.apache.zookeeper.ClientCnxn: Session
> establishment complete on server dap87/134.41.33.87:2181, sessionid =
> 0x34db2f72ac50c86, negotiated timeout = 10000
> 2015-06-05 06:06:55,343 INFO
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore:
> Watcher event type: None with state:SyncConnected for path:null for Service
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore in state
> org.apache.hadoop.yarn.server.resourcemanager.recovery.RMStateStore: STARTED
> 2015-06-05 06:06:55,343 INFO
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore:
> ZKRMStateStore Session connected
> 2015-06-05 06:06:55,344 INFO
> org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore:
> ZKRMStateStore Session restored
> 2015-06-05 06:06:55,345 WARN org.apache.zookeeper.ClientCnxn: Session
> 0x34db2f72ac50c86 for server dap87/134.41.33.87:2181, unexpected error,
> closing socket connection and attempting reconnect
> java.io.IOException: Broken pipe
> at sun.nio.ch.FileDispatcherImpl.write0(Native Method)
> at sun.nio.ch.SocketDispatcher.write(SocketDispatcher.java:47)
> at sun.nio.ch.IOUtil.writeFromNativeBuffer(IOUtil.java:94)
> at sun.nio.ch.IOUtil.write(IOUtil.java:65)
> at sun.nio.ch.SocketChannelImpl.write(SocketChannelImpl.java:450)
> at
> org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:117)
> at
> org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:355)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1075)