[ https://issues.apache.org/jira/browse/YARN-7630?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Botong Huang updated YARN-7630:
-------------------------------
    Attachment: YARN-7630.v1.patch

> Fix AMRMToken handling in AMRMProxy
> -----------------------------------
>
>                 Key: YARN-7630
>                 URL: https://issues.apache.org/jira/browse/YARN-7630
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Botong Huang
>            Assignee: Botong Huang
>            Priority: Minor
>         Attachments: YARN-7630.v1.patch
>
>
> Symptom: after the RM rolls over the master key for the AMRMToken, whenever
> the RPC connection from FederationInterceptor to the RM breaks because of a
> transient network issue and reconnects, heartbeats to the RM start failing
> with an “Invalid AMRMToken” exception. When this occurs, it affects both the
> home RM and the secondary RMs.
> Related facts:
> 1. When the RM issues a new AMRMToken, it always sends it with the service
> name field set to an empty string. The RPC layer on the AM side sets the
> field properly before the token is used.
> 2. UGI keeps all tokens in a map from serviceName to Token. Initially,
> AMRMClientUtils.createRMProxy() is used to load the first token and start the
> RM connection.
> 3. When the RM renews the token, YarnServerSecurityUtils.updateAMRMToken() is
> used to load it into UGI and replace the existing token (with the same
> serviceName key).
> Bug:
> AMRMClientUtils.createRMProxy() (fact 2) and
> YarnServerSecurityUtils.updateAMRMToken() (fact 3) do not handle the sequence
> consistently. The token must always be loaded into UGI first (while its
> service name is still empty) and only then have its serviceName set, so that
> the previous AMRMToken is overwritten. createRMProxy() does this in the
> reverse order. That is why, after the RM rolls the AMRMToken, UGI ends up
> with two tokens. Whenever the RPC connection breaks and reconnects, the wrong
> token can be picked, which triggers the exception.
> Fix:
> Load the AMRMToken into UGI first, and then update the service name field
> for RPC.
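
The ordering problem described above can be illustrated with a minimal, self-contained sketch. It does not use Hadoop's real classes: Token, CredentialStore, RM_ADDRESS and the "key-1"/"key-2" labels are hypothetical stand-ins, and the only assumption taken from the description is that tokens are stored in a map keyed by the service name the token carries at the moment it is loaded.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Simplified model for illustration only; these are NOT Hadoop's classes.
class AmrmTokenOrderSketch {

    // Hypothetical token: the RM always issues it with an empty service name.
    static final class Token {
        final String id;      // hypothetical label for the master-key generation
        String service = "";  // set later by the client side for RPC
        Token(String id) { this.id = id; }
    }

    // Hypothetical stand-in for the UGI token map (serviceName -> Token).
    static final class CredentialStore {
        private final Map<String, Token> tokens = new LinkedHashMap<>();

        // The key is whatever service name the token has when it is loaded.
        void addToken(Token t) { tokens.put(t.service, t); }

        List<String> tokenIds() {
            List<String> ids = new ArrayList<>();
            for (Token t : tokens.values()) {
                ids.add(t.id);
            }
            return ids;
        }
    }

    static final String RM_ADDRESS = "rm-host:8030"; // hypothetical service name

    // Returns the ids of the tokens left in the store after one rollover.
    static List<String> storedTokenIds(boolean loadBeforeSettingService) {
        CredentialStore ugi = new CredentialStore();

        // Initial token, modelling the createRMProxy() path.
        Token first = new Token("key-1");
        if (loadBeforeSettingService) {
            ugi.addToken(first);         // stored under ""
            first.service = RM_ADDRESS;  // service name set afterwards
        } else {
            first.service = RM_ADDRESS;  // reversed order, as in the bug
            ugi.addToken(first);         // stored under RM_ADDRESS
        }

        // Renewed token, modelling the updateAMRMToken() path:
        // always loaded with the empty service name first.
        Token renewed = new Token("key-2");
        ugi.addToken(renewed);           // stored under ""
        renewed.service = RM_ADDRESS;

        return ugi.tokenIds();
    }

    public static void main(String[] args) {
        System.out.println("reversed order: " + storedTokenIds(false));
        System.out.println("load first:     " + storedTokenIds(true));
    }
}
```

In this model the reversed order leaves both the stale and the renewed token in the store, because they were loaded under different keys, while loading first lets the renewed token replace the old one under the same empty-string key. The actual patch may differ in detail; the sketch only shows why the order of the two steps matters.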