[ https://issues.apache.org/jira/browse/YARN-579?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Daryn Sharp reopened YARN-579:
------------------------------
This has broken secure clusters. The AM is unable to find the token to
register with the RM. I've debugged it far enough to see that localization has
put the token in the nm-private dir, so it looks like the AM has amnesia when
it connects to the RM.
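For readers unfamiliar with the error in the trace below, here is a rough model of why a token that was never loaded would surface as "SIMPLE authentication is not enabled". It is a sketch, not Hadoop code: it assumes the RPC client uses token (DIGEST) authentication when its credentials hold a token for the target service, Kerberos when it has a ticket, and otherwise falls back to SIMPLE, which a secure RM rejects. The class, method, and service names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of client-side auth selection; not the Hadoop implementation.
final class AuthChoiceSketch {
    enum AuthMethod { SIMPLE, KERBEROS, DIGEST }

    // Pick an auth method from what the caller's credentials actually contain.
    static AuthMethod choose(Map<String, String> tokensByService,
                             String service,
                             boolean hasKerberosTicket) {
        if (tokensByService.containsKey(service)) {
            return AuthMethod.DIGEST;      // a token for this service was loaded
        }
        if (hasKerberosTicket) {
            return AuthMethod.KERBEROS;    // no token, but a ticket is available
        }
        return AuthMethod.SIMPLE;          // nothing usable: a secure server refuses this
    }

    public static void main(String[] args) {
        Map<String, String> credentials = new HashMap<>();

        // The token file exists on disk, but nothing was loaded into the credentials.
        System.out.println(choose(credentials, "rm:8030", false));

        // Once the token is present in the credentials, token auth is selected.
        credentials.put("rm:8030", "application-token");
        System.out.println(choose(credentials, "rm:8030", false));
    }
}
```

In this model an AM container, which has no Kerberos ticket of its own, ends up on SIMPLE whenever the token is missing from its credentials, regardless of whether the token file was localized.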
{noformat}
2013-04-29 17:47:02,666 DEBUG [IPC Client (4914628) connection to $RM:8030 from $USER] org.apache.hadoop.ipc.Client: IPC Client (4914628) connection to $RM:8030 from $USER: stopped, remaining connections 1
2013-04-29 17:47:02,667 ERROR [main] org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator: Exception while registering
java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:128)
    at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:103)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:153)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.start(RMCommunicator.java:112)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.start(RMContainerAllocator.java:211)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.start(MRAppMaster.java:797)
    at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1369)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1365)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318)
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled. Available:[KERBEROS, DIGEST]
    at org.apache.hadoop.ipc.Client.call(Client.java:1229)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:100)
    ... 12 more
2013-04-29 17:47:02,668 ERROR [main] org.apache.hadoop.yarn.service.CompositeService: Error starting services org.apache.hadoop.mapreduce.v2.app.MRAppMaster
org.apache.hadoop.yarn.YarnException: java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:166)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.start(RMCommunicator.java:112)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.start(RMContainerAllocator.java:211)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.start(MRAppMaster.java:797)
    at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:68)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1369)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1365)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318)
Caused by: java.lang.reflect.UndeclaredThrowableException
    at org.apache.hadoop.yarn.exceptions.impl.pb.YarnRemoteExceptionPBImpl.unwrapAndThrowException(YarnRemoteExceptionPBImpl.java:128)
    at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:103)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.register(RMCommunicator.java:153)
    ... 11 more
Caused by: org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.AccessControlException): SIMPLE authentication is not enabled. Available:[KERBEROS, DIGEST]
    at org.apache.hadoop.ipc.Client.call(Client.java:1229)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:202)
    at com.sun.proxy.$Proxy28.registerApplicationMaster(Unknown Source)
    at org.apache.hadoop.yarn.api.impl.pb.client.AMRMProtocolPBClientImpl.registerApplicationMaster(AMRMProtocolPBClientImpl.java:100)
    ... 12 more
{noformat}
The failed registration then leads to an NPE when the service is stopped:
{noformat}
2013-04-29 17:47:02,668 INFO [main] org.apache.hadoop.yarn.service.CompositeService: Error stopping org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter
java.lang.NullPointerException
    at org.apache.hadoop.mapreduce.v2.app.rm.RMCommunicator.stop(RMCommunicator.java:219)
    at org.apache.hadoop.mapreduce.v2.app.rm.RMContainerAllocator.stop(RMContainerAllocator.java:251)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$ContainerAllocatorRouter.stop(MRAppMaster.java:803)
    at org.apache.hadoop.yarn.service.CompositeService.stop(CompositeService.java:99)
    at org.apache.hadoop.yarn.service.CompositeService.start(CompositeService.java:77)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.start(MRAppMaster.java:1014)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster$1.run(MRAppMaster.java:1369)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:415)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1477)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.initAndStartAppMaster(MRAppMaster.java:1365)
    at org.apache.hadoop.mapreduce.v2.app.MRAppMaster.main(MRAppMaster.java:1318)
{noformat}
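The trace above shows stop() being invoked from CompositeService.start() after a child service failed to start. A minimal sketch of a stop() that tolerates a start() that never completed follows. The class and field names are hypothetical, and which field is actually null at RMCommunicator.java:219 is not established here; the worker thread is only an assumed stand-in.

```java
// Hypothetical service, not the real RMCommunicator: it only illustrates
// guarding stop() against state that start() never got to initialize.
final class GuardedStopSketch {
    private final boolean failRegistration;
    private Thread worker; // assigned only when start() gets past registration

    GuardedStopSketch(boolean failRegistration) {
        this.failRegistration = failRegistration;
    }

    void start() {
        if (failRegistration) {
            // Mirrors the failure mode in the log: start aborts before the worker exists.
            throw new IllegalStateException("registration failed");
        }
        worker = new Thread(() -> {
            try {
                Thread.sleep(Long.MAX_VALUE); // idle until interrupted by stop()
            } catch (InterruptedException e) {
                // interrupted by stop(): fall through and exit
            }
        });
        worker.setDaemon(true);
        worker.start();
    }

    // Returns true if a running worker was stopped, false if there was nothing to stop.
    boolean stop() {
        Thread t = worker;
        if (t == null) {
            return false; // start() failed or was never called; nothing to clean up
        }
        t.interrupt();
        try {
            t.join(1000);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the caller's interrupt status
        }
        worker = null;
        return true;
    }

    public static void main(String[] args) {
        GuardedStopSketch service = new GuardedStopSketch(true);
        try {
            service.start();
        } catch (IllegalStateException e) {
            // The parent stops its children after a failed start; this must not throw.
            System.out.println("stopped a worker: " + service.stop());
        }
    }
}
```

With a guard of this kind the original registration failure stays the only error in the log instead of being followed by a secondary NPE.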
> Make ApplicationToken part of Container's token list to help RM-restart
> -----------------------------------------------------------------------
>
> Key: YARN-579
> URL: https://issues.apache.org/jira/browse/YARN-579
> Project: Hadoop YARN
> Issue Type: Sub-task
> Affects Versions: 2.0.4-alpha
> Reporter: Vinod Kumar Vavilapalli
> Assignee: Vinod Kumar Vavilapalli
> Fix For: 2.0.5-beta
>
> Attachments: YARN-579-20130422.1.txt,
> YARN-579-20130422.1_YARNChanges.txt
>
>
> Container is already persisted for helping RM restart. Instead of explicitly
> setting ApplicationToken in AM's env, if we change it to be in Container, we
> can avoid env and can also help restart.