Jason Lowe commented on YARN-3103:
On second thought, maybe the client doesn't need to know the service name the
RM used. The RM is already sending an updated token _that the RM generated_ to
the AM. If the AM blindly stuffs it into the credentials _before_ it tries to
fix up the token, then it will use whatever service name the RM left on the
token. As long as that service name matches the one the RM put in originally
(and ideally it's not going to collide with any other token) then we know it
will clobber the old AMRM token as intended. Then the client can fix up the
token service name _after_ it's been stored in the credentials, just like it
does during AM startup.
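To illustrate the ordering argument above, here is a minimal sketch using toy stand-ins. ToyToken, ToyCredentials, both update methods and the RM address are hypothetical names for illustration, not Hadoop's Token, Credentials or AMRMClientImpl code. The only assumption is that the credentials store keys each token by the service name it carries at the moment it is added.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Toy stand-ins for illustration only; these are NOT Hadoop's Token or Credentials classes.
class AmrmTokenOrderingSketch {

    /** Minimal token: a payload plus a mutable service name. */
    static final class ToyToken {
        final String secret;
        String service;

        ToyToken(String secret, String service) {
            this.secret = secret;
            this.service = service;
        }
    }

    /** Minimal credentials store: tokens keyed by the service name they carry when added. */
    static final class ToyCredentials {
        final Map<String, ToyToken> tokens = new LinkedHashMap<>();

        void addToken(ToyToken token) {
            tokens.put(token.service, token);
        }
    }

    /** AM startup: store the token under the RM-assigned service, then fix up the service. */
    static ToyCredentials startup(String rmService, String rmAddress) {
        ToyCredentials creds = new ToyCredentials();
        ToyToken original = new ToyToken("secret-1", rmService);
        creds.addToken(original);
        original.service = rmAddress;
        return creds;
    }

    /** Problem ordering: fix up the service first, then store. The old entry survives. */
    static int updateFixupFirst(ToyCredentials creds, ToyToken updated, String rmAddress) {
        updated.service = rmAddress;
        creds.addToken(updated);
        return creds.tokens.size();
    }

    /** Proposed ordering: store under the RM-assigned service, then fix up the service. */
    static int updateStoreFirst(ToyCredentials creds, ToyToken updated, String rmAddress) {
        creds.addToken(updated);
        updated.service = rmAddress;
        return creds.tokens.size();
    }

    public static void main(String[] args) {
        String rmService = "";                    // service name the RM left on the token
        String rmAddress = "rm.example.com:8030"; // hypothetical address used by the client

        ToyCredentials a = startup(rmService, rmAddress);
        System.out.println(updateFixupFirst(a, new ToyToken("secret-2", rmService), rmAddress)); // prints 2

        ToyCredentials b = startup(rmService, rmAddress);
        System.out.println(updateStoreFirst(b, new ToyToken("secret-2", rmService), rmAddress)); // prints 1
    }
}
```

In this model the store-first ordering leaves a single entry holding the new secret, while the fixup-first ordering leaves the stale token alongside the new one.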
So we just need the AM to generate something that will not collide with
non-AMRM tokens and also not collide with tokens from other cluster RMs.
Cluster ID is tempting, but if the AM is talking to two non-HA clusters then
I'm not sure we know the user bothered to configure the cluster ID. However, I
think we _have_ to use the cluster ID; otherwise two RMs in the same HA-enabled
cluster could generate different service names, which breaks things. So I think
the cluster ID is our best bet, with the caveat that if an AM needs to wield
multiple AMRM tokens then all clusters involved need to have unique cluster IDs.
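A rough sketch of the cluster ID trade-off, assuming a service name derived purely from the cluster ID. The "amrm:" prefix and the method names are hypothetical and not taken from the YARN code base; the map stands in for a credentials store keyed by service name.

```java
import java.util.HashMap;
import java.util.Map;

class ClusterIdServiceNameSketch {

    // Hypothetical naming scheme for illustration; not the actual YARN scheme.
    static String serviceNameFor(String clusterId) {
        return "amrm:" + clusterId;
    }

    /** Counts distinct credential entries after adding one AMRM token per RM. */
    static int distinctEntries(String... clusterIdsOfRms) {
        Map<String, String> creds = new HashMap<>();
        for (String clusterId : clusterIdsOfRms) {
            creds.put(serviceNameFor(clusterId), "token-from-" + clusterId);
        }
        return creds.size();
    }

    public static void main(String[] args) {
        // Two RMs of one HA cluster share a cluster ID, so the new token replaces the old.
        System.out.println(distinctEntries("clusterA", "clusterA")); // prints 1

        // Two separate clusters with unique IDs keep separate tokens.
        System.out.println(distinctEntries("clusterA", "clusterB")); // prints 2
    }
}
```

The first case is also the failure mode for two unrelated clusters that happen to share a cluster ID: their tokens would overwrite each other, which is why the IDs need to be unique.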
> AMRMClientImpl does not update AMRM token properly
> Key: YARN-3103
> URL: https://issues.apache.org/jira/browse/YARN-3103
> Project: Hadoop YARN
> Issue Type: Bug
> Components: client
> Affects Versions: 2.6.0
> Reporter: Jason Lowe
> Assignee: Jason Lowe
> Priority: Blocker
> AMRMClientImpl.updateAMRMToken updates the token service _before_ storing it
> to the credentials, so the token is mapped using the newly updated service
> rather than the empty service that was used when the RM created the original
> AMRM token. This leads to two AMRM tokens in the credentials and can still
> fail if the AMRMTokenSelector picks the wrong one.
> In addition, the AMRMClientImpl grabs the login user rather than the current
> user when security is enabled, so it's likely the UGI being updated is not
> the UGI that will be used when reconnecting to the RM.
> The end result is that AMs can fail with invalid token errors when trying to
> reconnect to an RM after a new AMRM secret has been activated.