[ 
https://issues.apache.org/jira/browse/YARN-9640?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Rohith Sharma K S updated YARN-9640:
------------------------------------
    Fix Version/s: 3.3.0

> Slow event processing could cause too many attempt unregister events
> --------------------------------------------------------------------
>
>                 Key: YARN-9640
>                 URL: https://issues.apache.org/jira/browse/YARN-9640
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Bibin A Chundatt
>            Assignee: Bibin A Chundatt
>            Priority: Critical
>              Labels: scalability
>             Fix For: 3.3.0
>
>         Attachments: YARN-9640.001.patch, YARN-9640.002.patch, 
> YARN-9640.003.patch
>
>
> During verification on one of our test clusters, we found that the number 
> of attempt unregister events reached 300k+.
>  # All of the AM's containers complete.
>  # AMRMClientImpl sends finishApplicationMaster.
>  # AMRMClient checks the finish status every 100 ms by resending the 
> finishApplicationMaster request.
>  # AMRMClientImpl#unregisterApplicationMaster:
> {code:java}
>       while (true) {
>         FinishApplicationMasterResponse response =
>             rmClient.finishApplicationMaster(request);
>         if (response.getIsUnregistered()) {
>           break;
>         }
>         LOG.info("Waiting for application to be successfully unregistered.");
>         Thread.sleep(100);
>       }
> {code}
>  # The ApplicationMasterService finishApplicationMaster interface sends an 
> unregister event on every status update.
> We should send the unregister event only once and cache the fact that it 
> was sent. On later requests, skip the event and return a "not unregistered" 
> response to the AM, so that the event queue is not overloaded.
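
The proposal above (send the unregister event once, remember that it was sent, and keep answering "not unregistered" until it is processed) could look roughly like the sketch below. This is an illustration only, not the attached patch: the class UnregisterDedupSketch, its fields, the string-based event queue, and the processOneEvent helper are all hypothetical stand-ins for the RM-side handling.

```java
import java.util.Queue;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentLinkedQueue;

// Hypothetical stand-in for the RM-side handler; NOT the real
// ApplicationMasterService code or the YARN-9640 patch.
class UnregisterDedupSketch {
    // Stand-in for the dispatcher's event queue.
    final Queue<String> eventQueue = new ConcurrentLinkedQueue<>();
    // Attempts for which an unregister event has already been enqueued.
    private final Set<String> unregisterSent = ConcurrentHashMap.newKeySet();
    // Attempts whose unregister event has been processed.
    private final Set<String> unregistered = ConcurrentHashMap.newKeySet();

    /** Returns true once unregistered, like response.getIsUnregistered(). */
    boolean finishApplicationMaster(String attemptId) {
        if (unregistered.contains(attemptId)) {
            return true;
        }
        // Set.add returns true only for the first caller, so repeated
        // polls from the AM enqueue at most one event per attempt.
        if (unregisterSent.add(attemptId)) {
            eventQueue.add("UNREGISTER:" + attemptId);
        }
        return false; // "not unregistered yet"; the AM keeps polling
    }

    /** Simulates the (possibly slow) dispatcher handling one event. */
    void processOneEvent() {
        String event = eventQueue.poll();
        if (event != null) {
            unregistered.add(event.substring("UNREGISTER:".length()));
        }
    }

    public static void main(String[] args) {
        UnregisterDedupSketch rm = new UnregisterDedupSketch();
        String attempt = "appattempt_0001_000001"; // made-up attempt id
        // The AM polls many times while the dispatcher is still busy.
        for (int i = 0; i < 1000; i++) {
            rm.finishApplicationMaster(attempt);
        }
        System.out.println("queued events: " + rm.eventQueue.size());
        rm.processOneEvent();
        System.out.println("unregistered: "
            + rm.finishApplicationMaster(attempt));
    }
}
```

In this sketch, 1000 polls from the AM leave a single event in the queue, whereas sending an event on every request would leave 1000.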



--
This message was sent by Atlassian Jira
(v8.3.2#803003)
