[
https://issues.apache.org/jira/browse/YARN-1061?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13737990#comment-13737990
]
Rohith Sharma K S commented on YARN-1061:
-----------------------------------------
Extracted thread dump from NodeManager is
{noformat}
"Node Status Updater" prio=10 tid=0x00000000414dc000 nid=0x1d754 in
Object.wait() [0x00007fefa2dec000]
java.lang.Thread.State: WAITING (on object monitor)
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:485)
at org.apache.hadoop.ipc.Client.call(Client.java:1231)
- locked <0x00000000def4f158> (a org.apache.hadoop.ipc.Client$Call)
at
org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
at $Proxy28.nodeHeartbeat(Unknown Source)
at
org.apache.hadoop.yarn.server.api.impl.pb.client.ResourceTrackerPBClientImpl.nodeHeartbeat(ResourceTrackerPBClientImpl.java:70)
at sun.reflect.GeneratedMethodAccessor24.invoke(Unknown Source)
at
sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:25)
at java.lang.reflect.Method.invoke(Method.java:597)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:164)
at
org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:83)
at $Proxy30.nodeHeartbeat(Unknown Source)
at
org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl$1.run(NodeStatusUpdaterImpl.java:348)
{noformat}
> NodeManager is indefinitely waiting for nodeHeartBeat() response from
> ResouceManager.
> -------------------------------------------------------------------------------------
>
> Key: YARN-1061
> URL: https://issues.apache.org/jira/browse/YARN-1061
> Project: Hadoop YARN
> Issue Type: Bug
> Components: nodemanager
> Affects Versions: 2.0.5-alpha
> Reporter: Rohith Sharma K S
>
> It is observed that in one of the scenario, NodeManger is indefinetly waiting
> for nodeHeartbeat response from ResouceManger where ResouceManger is in
> hanged up state.
> NodeManager should get timeout exception instead of waiting indefinetly.
--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira