[ https://issues.apache.org/jira/browse/YARN-1321?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802437#comment-13802437 ]
Alejandro Abdelnur commented on YARN-1321:
------------------------------------------
[~vinodkv], thanks for following up ....
Agree that the AMRMClient owns the NMTokenCache. But if we remove the setter
and we have an instance of the NMTokenCache instead of the singleton, then the
NMClient will break unless you use the constructor with the NMTokenCache that
you suggest.
IMO the right thing to do is:
* AMRMClient creates an instance of NMTokenCache
* AMRMClient has only a getter for the NMTokenCache
* NMClient constructor always takes an AMRMClient (to extract the NMTokenCache
from it)
But the latter is an incompatible change.
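To make the three points above concrete, here is a rough sketch of the ownership model. It is not the Hadoop code and not the attached patch: TokenCache, AmRmClient and NmClient are hypothetical, simplified stand-ins for NMTokenCache, AMRMClient and NMClient, tokens are plain strings, and the node address and token values are made up for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Illustration only: hypothetical stand-ins, not the real YARN client classes.
class TokenCacheSketch {

    // Stand-in for NMTokenCache as an ordinary instance instead of a singleton.
    static class TokenCache {
        private final Map<String, String> tokensByNode = new ConcurrentHashMap<>();

        void setToken(String nodeAddress, String token) {
            tokensByNode.put(nodeAddress, token);
        }

        String getToken(String nodeAddress) {
            return tokensByNode.get(nodeAddress);
        }
    }

    // Stand-in for AMRMClient: creates its own cache and exposes a getter only.
    static class AmRmClient {
        private final TokenCache tokenCache = new TokenCache();

        TokenCache getTokenCache() {
            return tokenCache;
        }

        // Simulates storing an NMToken handed out for a node (assumed flow).
        void onTokenReceived(String nodeAddress, String token) {
            tokenCache.setToken(nodeAddress, token);
        }
    }

    // Stand-in for NMClient: always built from an AmRmClient, so it can only
    // ever see the cache that belongs to that client.
    static class NmClient {
        private final TokenCache tokenCache;

        NmClient(AmRmClient amRmClient) {
            this.tokenCache = amRmClient.getTokenCache();
        }

        String tokenFor(String nodeAddress) {
            return tokenCache.getToken(nodeAddress);
        }
    }

    public static void main(String[] args) {
        // Two AMs in one JVM, both holding a token for the same node.
        AmRmClient am1 = new AmRmClient();
        AmRmClient am2 = new AmRmClient();
        am1.onTokenReceived("node-a:8041", "token-for-attempt-0001");
        am2.onTokenReceived("node-a:8041", "token-for-attempt-0002");

        NmClient nm1 = new NmClient(am1);
        NmClient nm2 = new NmClient(am2);

        // Each NM client reads the token of its own AM; neither overwrites the other.
        System.out.println(nm1.tokenFor("node-a:8041"));
        System.out.println(nm2.tokenFor("node-a:8041"));
    }
}
```

With a single shared cache the second onTokenReceived call would replace the first entry for the same node, which is the collision described in the issue. The cost of this shape is that an NmClient can no longer be built without an AmRmClient.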
Thoughts?
> NMTokenCache is a singleton, prevents multiple AMs running in a single JVM
> from working correctly
> ----------------------------------------------------------------------------------------------
>
> Key: YARN-1321
> URL: https://issues.apache.org/jira/browse/YARN-1321
> Project: Hadoop YARN
> Issue Type: Bug
> Components: client
> Affects Versions: 2.2.0
> Reporter: Alejandro Abdelnur
> Assignee: Alejandro Abdelnur
> Priority: Blocker
> Attachments: YARN-1321.patch, YARN-1321.patch, YARN-1321.patch,
> YARN-1321.patch
>
>
> NMTokenCache is a singleton. Because of this, when multiple AMs run in a
> single JVM, NMTokens for the same node from different AMs overwrite each
> other, and starting containers fails because of mismatched tokens.
> The error observed on the client side is something like:
> {code}
> ERROR org.apache.hadoop.security.UserGroupInformation:
> PriviledgedActionException as:llama (auth:PROXY) via llama (auth:SIMPLE)
> cause:org.apache.hadoop.yarn.exceptions.YarnException: Unauthorized request
> to start container.
> NMToken for application attempt : appattempt_1382038445650_0002_000001 was
> used for starting container with container token issued for application
> attempt : appattempt_1382038445650_0001_000001
> {code}