Brahma Reddy Battula created YARN-4427:
------------------------------------------
Summary: NPE on handleNMContainerStatus when NM is registering to
RM
Key: YARN-4427
URL: https://issues.apache.org/jira/browse/YARN-4427
Project: Hadoop YARN
Issue Type: Bug
Reporter: Brahma Reddy Battula
Assignee: Brahma Reddy Battula
*Seen in one of our environments: the AM was allocated a container, but the
update failed to be written to ZK while the cluster was having intermittent
network problems (up and down).*
{noformat}
2015-12-07 16:39:38,489 | WARN | IPC Server handler 49 on 26003 | IPC Server handler 49 on 26003, call org.apache.hadoop.yarn.server.api.ResourceTrackerPB.registerNodeManager from 9.91.8.220:52169 Call#17 Retry#0 | Server.java:2107
java.lang.NullPointerException
at org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.handleNMContainerStatus(ResourceTrackerService.java:286)
at org.apache.hadoop.yarn.server.resourcemanager.ResourceTrackerService.registerNodeManager(ResourceTrackerService.java:395)
at org.apache.hadoop.yarn.server.api.impl.pb.service.ResourceTrackerPBServiceImpl.registerNodeManager(ResourceTrackerPBServiceImpl.java:54)
at org.apache.hadoop.yarn.proto.ResourceTracker$ResourceTrackerService$2.callBlockingMethod(ResourceTracker.java:79)
at org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:616)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:973)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2088)
at org.apache.hadoop.ipc.Server$Handler$1.run(Server.java:2084)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1673)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2082)
{noformat}
The corresponding code follows. It might not match {{branch-2.7/Trunk}}, since
we have modified it internally.
{code}
284 RMAppAttempt rmAppAttempt = rmApp.getRMAppAttempt(appAttemptId);
285 Container masterContainer = rmAppAttempt.getMasterContainer();
286 if (masterContainer.getId().equals(containerStatus.getContainerId())
287     && containerStatus.getContainerState() == ContainerState.COMPLETE) {
288   ContainerStatus status =
289       ContainerStatus.newInstance(containerStatus.getContainerId(),
290           containerStatus.getContainerState(), containerStatus.getDiagnostics(),
291           containerStatus.getContainerExitStatus());
292   // sending master container finished event.
293   RMAppAttemptContainerFinishedEvent evt =
294       new RMAppAttemptContainerFinishedEvent(appAttemptId, status,
295           nodeId);
296   rmContext.getDispatcher().getEventHandler().handle(evt);
297 }
{code}
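The stack trace points at line 286 of the excerpt above, where {{masterContainer.getId()}} is dereferenced without a null check. One plausible reading, not confirmed in this report, is that {{getMasterContainer()}} returned null because the master container was never persisted. The sketch below illustrates a defensive check under that assumption. It uses hypothetical stand-in types ({{MasterContainerGuard}}, {{AppAttempt}}, and simplified {{Container}} / {{ContainerStatus}} classes), not the real Hadoop classes, and it is not the committed fix.

```java
import java.util.Objects;

/**
 * Minimal sketch of a null-safe master-container check.
 * All nested types are hypothetical stand-ins for the YARN classes
 * named in the report; they are NOT the real Hadoop API.
 */
public class MasterContainerGuard {

    public enum ContainerState { NEW, RUNNING, COMPLETE }

    /** Stand-in for a container; only the ID matters here. */
    public static final class Container {
        private final String id;
        public Container(String id) { this.id = id; }
        public String getId() { return id; }
    }

    /** Stand-in for the status an NM reports at registration. */
    public static final class ContainerStatus {
        private final String containerId;
        private final ContainerState state;
        public ContainerStatus(String containerId, ContainerState state) {
            this.containerId = containerId;
            this.state = state;
        }
        public String getContainerId() { return containerId; }
        public ContainerState getContainerState() { return state; }
    }

    /** Stand-in for an app attempt whose master container may be null. */
    public static final class AppAttempt {
        private final Container masterContainer;
        public AppAttempt(Container masterContainer) {
            this.masterContainer = masterContainer;
        }
        public Container getMasterContainer() { return masterContainer; }
    }

    /**
     * Returns true only when the reported status belongs to the attempt's
     * master container and that container has completed. A missing attempt,
     * missing master container, or missing status yields false instead of
     * throwing a NullPointerException.
     */
    public static boolean isCompletedMasterContainer(AppAttempt attempt,
                                                     ContainerStatus status) {
        if (attempt == null || status == null) {
            return false;
        }
        Container master = attempt.getMasterContainer();
        if (master == null) {
            // Assumed failure mode: the master container was never recorded.
            return false;
        }
        // Objects.equals also tolerates a null ID on either side.
        return Objects.equals(master.getId(), status.getContainerId())
            && status.getContainerState() == ContainerState.COMPLETE;
    }

    public static void main(String[] args) {
        ContainerStatus done =
            new ContainerStatus("container_1", ContainerState.COMPLETE);
        System.out.println(
            isCompletedMasterContainer(new AppAttempt(null), done));
        System.out.println(
            isCompletedMasterContainer(
                new AppAttempt(new Container("container_1")), done));
    }
}
```

In the real method, the "finished" event would be dispatched only when such a check returns true. Whether a null master container should be skipped silently or logged is a design decision for the actual patch.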
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)