[
https://issues.apache.org/jira/browse/YARN-8673?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16586410#comment-16586410
]
Hudson commented on YARN-8673:
------------------------------
SUCCESS: Integrated in Jenkins build Hadoop-trunk-Commit #14805 (See
[https://builds.apache.org/job/Hadoop-trunk-Commit/14805/])
YARN-8673. [AMRMProxy] More robust responseId resync after an YarnRM (gifuma:
rev 8736fc39ac3b3de168d2c216f3d1c0edb48fb3f9)
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/client/AMRMClientUtils.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/AMRMClientRelayer.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/FederationInterceptor.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/uam/UnmanagedApplicationManager.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/main/java/org/apache/hadoop/yarn/server/resourcemanager/ApplicationMasterService.java
* (edit) hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/test/java/org/apache/hadoop/yarn/server/TestAMRMClientRelayer.java
> [AMRMProxy] More robust responseId resync after an YarnRM master slave switch
> -----------------------------------------------------------------------------
>
> Key: YARN-8673
> URL: https://issues.apache.org/jira/browse/YARN-8673
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: amrmproxy
> Reporter: Botong Huang
> Assignee: Botong Huang
> Priority: Major
> Attachments: YARN-8673.v1.patch, YARN-8673.v2.patch
>
>
> After a master-slave switch of YarnRM, the new YarnRM throws an
> _ApplicationNotRegisteredException_. The AM then re-registers and resets the
> responseId to zero. _AMRMClientRelayer_ inside _FederationInterceptor_
> follows the same protocol and performs the automatic re-register and
> responseId resync. However, when an exception or a temporary network issue
> occurs in the allocate call after re-register, the resync logic might break.
> This patch improves the robustness of the process by parsing the expected
> responseId from the YarnRM exception message, so that whenever the
> responseId is out of sync for whatever reason, we can automatically resync
> and move on.
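
The parsing step described above could look roughly like the sketch below. It is an illustration only, not the code from the patch: the class name ResponseIdResync, the method parseExpectedResponseId, and the message fragment "expect responseId to be N" are all assumptions made for the example, and the real YarnRM message format may differ.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: extracts the responseId that the
// RM says it expects from the text of an exception message.
final class ResponseIdResync {

    // Assumed message fragment; the actual wording used by YarnRM may differ.
    private static final Pattern EXPECTED_ID =
        Pattern.compile("expect responseId to be (\\d+)");

    private ResponseIdResync() {
    }

    /**
     * Returns the expected responseId found in the message, or an empty
     * OptionalInt when the message is null, has no such fragment, or the
     * number does not fit in an int.
     */
    public static OptionalInt parseExpectedResponseId(String message) {
        if (message == null) {
            return OptionalInt.empty();
        }
        Matcher m = EXPECTED_ID.matcher(message);
        if (!m.find()) {
            return OptionalInt.empty();
        }
        try {
            return OptionalInt.of(Integer.parseInt(m.group(1)));
        } catch (NumberFormatException e) {
            return OptionalInt.empty();
        }
    }

    public static void main(String[] args) {
        String msg = "Invalid responseId in AllocateRequest, "
            + "expect responseId to be 7, but get 3";
        int lastResponseId = 3;
        // On a mismatch, adopt the id the RM expects and retry the allocate.
        OptionalInt expected = parseExpectedResponseId(msg);
        if (expected.isPresent()) {
            lastResponseId = expected.getAsInt();
        }
        System.out.println("resynced responseId = " + lastResponseId);
    }
}
```

Returning an empty OptionalInt for unrecognized messages lets the caller fall back to its normal error handling instead of resyncing on a guessed value.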