Botong Huang created YARN-8673:
----------------------------------
Summary: [AMRMProxy] More robust responseId resync after a YarnRM
master-slave switch
Key: YARN-8673
URL: https://issues.apache.org/jira/browse/YARN-8673
Project: Hadoop YARN
Issue Type: Task
Reporter: Botong Huang
Assignee: Botong Huang
After a master-slave switch of YarnRM, the new YarnRM throws an
_ApplicationNotRegisteredException_. The AM then re-registers and resets the
responseId to zero. _AMRMClientRelayer_ inside _FederationInterceptor_ follows
the same protocol and performs the re-register and responseId resync
automatically. However, if an exception or a temporary network issue occurs in
the allocate call after re-register, the resync logic can break. This patch
makes the process more robust by parsing the expected responseId from the
YarnRM exception message, so that whenever the responseId is out of sync, for
whatever reason, we can automatically resync and move on.
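
As a rough illustration of the parsing idea only (not the code in this patch),
the sketch below pulls an expected responseId out of an exception message with
a regular expression. The class name, method name, and the message wording
"expect responseId to be N" are all assumptions made for the example; the
actual YarnRM message text and the patch's helper may differ.

```java
import java.util.OptionalInt;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: shows one way to recover the
// expected responseId from an exception message.
class ResponseIdResync {
  // ASSUMPTION: the message contains "expect responseId to be <int>".
  private static final Pattern EXPECTED_ID =
      Pattern.compile("expect responseId to be (-?\\d+)");

  // Returns the expected responseId, or empty if the message does not
  // match the assumed wording (caller should then rethrow the exception).
  static OptionalInt parseExpectedResponseId(String message) {
    if (message == null) {
      return OptionalInt.empty();
    }
    Matcher m = EXPECTED_ID.matcher(message);
    if (!m.find()) {
      return OptionalInt.empty();
    }
    try {
      return OptionalInt.of(Integer.parseInt(m.group(1)));
    } catch (NumberFormatException e) {
      // Digits matched but the value does not fit in an int.
      return OptionalInt.empty();
    }
  }

  public static void main(String[] args) {
    // Made-up message used purely to exercise the parser.
    String msg = "Invalid responseId in AllocateRequest, "
        + "expect responseId to be 7, but get 0";
    OptionalInt expected = parseExpectedResponseId(msg);
    if (expected.isPresent()) {
      // A relayer could set its local responseId to this value and retry.
      System.out.println("resync to " + expected.getAsInt());
    } else {
      System.out.println("message not recognized, rethrow");
    }
  }
}
```

Returning an empty result for unrecognized messages lets the caller fall back
to rethrowing the original exception instead of guessing a responseId.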