[
https://issues.apache.org/jira/browse/YARN-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Bibin A Chundatt updated YARN-9640:
-----------------------------------
Attachment: YARN-9640.002.patch
> Slow event processing could cause too many attempt unregister events
> --------------------------------------------------------------------
>
> Key: YARN-9640
> URL: https://issues.apache.org/jira/browse/YARN-9640
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Bibin A Chundatt
> Assignee: Bibin A Chundatt
> Priority: Critical
> Labels: scalability
> Attachments: YARN-9640.001.patch, YARN-9640.002.patch
>
>
> During verification on one of our test clusters, we found that the number of
> attempt unregister events was about 300k+.
> # All containers of the AM complete.
> # AMRMClientImpl sends finishApplicationMaster.
> # AMRMClient checks the finish status every 100 ms by resending the
> finishApplicationMaster request.
> # AMRMClientImpl#unregisterApplicationMaster
> {code:java}
> while (true) {
>   FinishApplicationMasterResponse response =
>       rmClient.finishApplicationMaster(request);
>   if (response.getIsUnregistered()) {
>     break;
>   }
>   LOG.info("Waiting for application to be successfully unregistered.");
>   Thread.sleep(100);
> }
> {code}
> # The ApplicationMasterService finishApplicationMaster interface sends an
> unregister event on every one of these status checks.
> We should send the unregister event only once and cache the fact that it was
> sent. Later requests for the same attempt should not queue another event; they
> should just return a not-unregistered response to the AM, so that the event
> queue is not overloaded.
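The proposal above can be illustrated with a minimal, self-contained sketch. This is not the attached patch and does not use the real ApplicationMasterService or dispatcher classes; the class UnregisterOnceSketch, its methods, and the use of plain strings as attempt IDs are all hypothetical stand-ins. It only shows the idea of remembering that an unregister event was already queued for an attempt and answering later polls with "not yet unregistered" without queueing anything.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.AtomicInteger;

/** Hypothetical illustration only; names here are not from the YARN code base. */
class UnregisterOnceSketch {
    // Attempts for which an unregister event has already been queued.
    private final Set<String> unregisterSent = ConcurrentHashMap.newKeySet();
    // Attempts whose unregister event has been fully processed.
    private final Set<String> unregistered = ConcurrentHashMap.newKeySet();
    // Stand-in for the event queue: counts events actually enqueued.
    private final AtomicInteger eventsQueued = new AtomicInteger();

    /** Simulates one finishApplicationMaster call; returns the isUnregistered flag. */
    boolean finishApplicationMaster(String attemptId) {
        if (unregistered.contains(attemptId)) {
            return true;
        }
        // Set.add returns true only for the first caller, so exactly one
        // event is queued per attempt even under concurrent polling.
        if (unregisterSent.add(attemptId)) {
            eventsQueued.incrementAndGet();
        }
        return false; // AM keeps polling, but no further events are queued
    }

    /** Simulates the (possibly slow) event handler finishing the unregister. */
    void markUnregistered(String attemptId) {
        // Cached entries are never evicted in this sketch; a real
        // implementation would need to clean them up.
        unregistered.add(attemptId);
    }

    int eventsQueued() {
        return eventsQueued.get();
    }

    public static void main(String[] args) {
        UnregisterOnceSketch service = new UnregisterOnceSketch();
        // The AM polls 50 times while event processing is slow.
        for (int i = 0; i < 50; i++) {
            service.finishApplicationMaster("attempt_1");
        }
        System.out.println("events queued: " + service.eventsQueued()); // prints "events queued: 1"
        service.markUnregistered("attempt_1");
        System.out.println("unregistered: " + service.finishApplicationMaster("attempt_1")); // prints "unregistered: true"
    }
}
```

With this kind of guard, the number of queued unregister events is bounded by the number of attempts rather than by how many times each AM polls while the dispatcher is slow.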