[
https://issues.apache.org/jira/browse/YARN-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16291714#comment-16291714
]
Botong Huang commented on YARN-7630:
------------------------------------
Cool, thanks [~asuresh] and [~subru]!
> Fix AMRMToken rollover handling in AMRMProxy
> --------------------------------------------
>
> Key: YARN-7630
> URL: https://issues.apache.org/jira/browse/YARN-7630
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Botong Huang
> Assignee: Botong Huang
> Priority: Minor
> Fix For: 3.1.0, 2.10.0, 2.9.1
>
> Attachments: YARN-7630.v1.patch, YARN-7630.v1.patch
>
>
> Symptom: after the RM rolls over the master key for the AMRMToken, whenever the
> RPC connection from FederationInterceptor to the RM breaks due to a transient
> network issue and reconnects, heartbeats to the RM start failing with the
> “Invalid AMRMToken” exception. When this happens, it affects both the home RM
> and the secondary RMs.
> Related facts:
> 1. When the RM issues a new AMRMToken, it always sends it with the service name
> field set to an empty string. The RPC layer on the AM side sets the field
> properly before the token is used.
> 2. UGI keeps all tokens using a map from serviceName->Token. Initially
> AMRMClientUtils.createRMProxy() is used to load the first token and start the
> RM connection.
> 3. When the RM renews the token, YarnServerSecurityUtils.updateAMRMToken() is
> used to load it into UGI and replace the existing token (with the same
> serviceName key).
> Bug:
> The bug is that AMRMClientUtils.createRMProxy() (fact 2) and
> YarnServerSecurityUtils.updateAMRMToken() (fact 3) do not handle the sequence
> consistently. We always need to load the token (with an empty service name)
> into UGI first, before we set the serviceName, so that the previous AMRMToken
> is overridden. But createRMProxy() does it in the reverse order. That’s why,
> after the RM rolls the AMRMToken, the UGI ends up with two tokens. Whenever the
> RPC connection breaks and reconnects, the wrong token can be picked, which
> triggers the exception.
> Fix:
> The AMRMToken should be loaded into UGI first, and only then should the
> service name field be updated for RPC.
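
The ordering problem described in the issue can be illustrated with a minimal, self-contained sketch. It is not taken from the patch and does not use the Hadoop API: FakeToken, FakeCredentials, RM_ADDRESS and the method names are hypothetical stand-ins, and the sketch assumes only what the description states, namely that tokens are stored in a map keyed by the service name they carry at the moment they are loaded.

```java
import java.util.HashMap;
import java.util.Map;

// Simplified model of the behaviour described in the issue; NOT the Hadoop API.
final class TokenOrderSketch {

    // Hypothetical RM address, used only as an example service name.
    static final String RM_ADDRESS = "rm-host:8030";

    /** Hypothetical stand-in for a token: an id plus a mutable service name. */
    static final class FakeToken {
        final String id;
        String service = ""; // per the issue, the RM issues tokens with an empty service name
        FakeToken(String id) { this.id = id; }
    }

    /** Hypothetical stand-in for the UGI token store (serviceName -> token). */
    static final class FakeCredentials {
        final Map<String, FakeToken> tokens = new HashMap<>();
        // The key is whatever service name the token carries when it is added.
        void addToken(FakeToken t) { tokens.put(t.service, t); }
    }

    /** Models the rollover path: load the renewed token first, then set the service. */
    static void rollover(FakeCredentials ugi, String id) {
        FakeToken renewed = new FakeToken(id);
        ugi.addToken(renewed);        // stored under the empty key
        renewed.service = RM_ADDRESS; // service name set afterwards for RPC
    }

    /** Initial load sets the service name BEFORE adding the token (the reported bug). */
    static FakeCredentials buggyOrder() {
        FakeCredentials ugi = new FakeCredentials();
        FakeToken first = new FakeToken("token-1");
        first.service = RM_ADDRESS;
        ugi.addToken(first);          // stored under RM_ADDRESS, not under ""
        rollover(ugi, "token-2");     // stored under "", so token-1 survives
        return ugi;
    }

    /** Initial load adds the token first, then sets the service name (the proposed fix). */
    static FakeCredentials fixedOrder() {
        FakeCredentials ugi = new FakeCredentials();
        FakeToken first = new FakeToken("token-1");
        ugi.addToken(first);          // stored under ""
        first.service = RM_ADDRESS;
        rollover(ugi, "token-2");     // same key, so token-1 is replaced
        return ugi;
    }

    public static void main(String[] args) {
        System.out.println("buggy order: " + buggyOrder().tokens.size() + " token(s)"); // prints 2
        System.out.println("fixed order: " + fixedOrder().tokens.size() + " token(s)"); // prints 1
    }
}
```

In this model the buggy order leaves both the stale and the renewed token in the store, which matches the "two tokens" state described above, while the fixed order leaves only the renewed one.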
--
This message was sent by Atlassian JIRA
(v6.4.14#64029)