Bibin A Chundatt created YARN-6803:
--------------------------------------

             Summary: AM registration could fail if event processing is delayed.
                 Key: YARN-6803
                 URL: https://issues.apache.org/jira/browse/YARN-6803
             Project: Hadoop YARN
          Issue Type: Bug
            Reporter: Bibin A Chundatt
            Assignee: Bibin A Chundatt
            Priority: Critical


Steps to reproduce

#  Submit application
#  Delay application attempt AMLauch event processing
#  Make AM register before AM Launch event is fired

{{DefaultAMSProcessor#registerApplicationMaster}} client token 
{code}
    if (UserGroupInformation.isSecurityEnabled()) {
      LOG.info("Setting client token master key");
      response.setClientToAMTokenMasterKey(java.nio.ByteBuffer.wrap(
          getRmContext().getClientToAMTokenSecretManager()
          .getMasterKey(applicationAttemptId).getEncoded()));
    }
{code}
{code}
org.apache.hadoop.yarn.exceptions.YarnRuntimeException: 
java.lang.NullPointerException: java.lang.NullPointerException
        at 
org.apache.hadoop.yarn.server.resourcemanager.DefaultAMSProcessor.registerApplicationMaster(DefaultAMSProcessor.java:130)
        at 
org.apache.hadoop.yarn.server.resourcemanager.ApplicationMasterService.registerApplicationMaster(ApplicationMasterService.java:217)
        at 
org.apache.hadoop.yarn.api.impl.pb.service.ApplicationMasterProtocolPBServiceImpl.registerApplicationMaster(ApplicationMasterProtocolPBServiceImpl.java:90)
        at 
org.apache.hadoop.yarn.proto.ApplicationMasterProtocol$ApplicationMasterProtocolService$2.callBlockingMethod(ApplicationMasterProtocol.java:95)
        at 
org.apache.hadoop.ipc.ProtobufRpcEngine$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine.java:522)
        at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:991)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:869)
        at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:815)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
        at org.apache.hadoop.ipc.Server$Handler.run(Server.java:2675)

        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:177)
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.serviceStart(RMCommunicator.java:121)
        at 
org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.serviceStart(RMContainerAllocator.java:280)
        at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.serviceStart(MRAppMaster.java:978)
        at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at 
org.apache.hadoop.service.CompositeService.serviceStart(CompositeService.java:120)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.serviceStart(MRAppMaster.java:1280)
        at 
org.apache.hadoop.service.AbstractService.start(AbstractService.java:193)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster$6.run(MRAppMaster.java:1733)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:422)
        at 
org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1965)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1729)
        at 
org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1660)
{code}

*Root Cause*
{{ClientToAMTokenSecretManagerInRM}} token master key is set only after AMLauch 
event is fired.

{{AMLaunchedTransition}}
{code}
      // register the ClientTokenMasterKey after it is saved in the store,
      // otherwise client may hold an invalid ClientToken after RM restarts.
      if (UserGroupInformation.isSecurityEnabled()) {
        appAttempt.rmContext.getClientToAMTokenSecretManager()
            .registerApplication(appAttempt.getAppAttemptId(),
            appAttempt.getClientTokenMasterKey());
      }
{code}



--
This message was sent by Atlassian JIRA
(v6.4.14#64029)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to