[ 
https://issues.apache.org/jira/browse/YARN-6168?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16259957#comment-16259957
 ] 

Jian He commented on YARN-6168:
-------------------------------

- Does AllocateResponsePBImpl#mergeLocalToBuilder need some changes too?
- recoveredPreviousAttemptContainers could hold Container objects instead of 
RMContainer, so that pullPreviousAttemptContainers doesn't need to convert 
each RMContainer to a Container.
- I think getLiveContainers and clearPreviousContainers need to be in the same 
synchronization block. Otherwise, it is possible to lose previous containers, 
for example: 
1. The AM acquires the live containers on register.
2. Containers are added to both the live containers and the previous 
containers.
3. The previous containers are cleared, so the containers added in step 2 are 
never reported to the AM.
{code}
    Collection<RMContainer> liveContainers =
        app.getCurrentAppAttempt().getLiveContainers();
    app.getCurrentAppAttempt().resetPreviousAttemptContainers();
{code}
- Could you add a comment in the header of 
testContainersFromPreviousAttemptsWithRMRestart explaining what the test does, 
so that others don't need to dig into the code to understand it?
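
To illustrate the synchronization point above, here is a minimal sketch of 
taking the live-container snapshot and clearing the previous containers under 
one lock. AttemptContainers, recoverContainer and pullLiveAndResetPrevious are 
hypothetical names used only for this illustration, and container ids are 
plain strings; this is not the actual scheduler code or the patch.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for the per-attempt container bookkeeping.
// Not the real scheduler class; names are assumptions for illustration.
final class AttemptContainers {
    private final List<String> liveContainers = new ArrayList<>();
    private final List<String> recoveredPreviousAttemptContainers =
        new ArrayList<>();

    // Called when an NM reports a container from a previous attempt.
    synchronized void recoverContainer(String containerId) {
        liveContainers.add(containerId);
        recoveredPreviousAttemptContainers.add(containerId);
    }

    // Used on AM register: snapshot the live containers and clear the
    // pending list while holding the same lock, so a container recovered
    // concurrently is either in the snapshot or still pending.
    synchronized List<String> pullLiveAndResetPrevious() {
        List<String> snapshot = new ArrayList<>(liveContainers);
        recoveredPreviousAttemptContainers.clear();
        return snapshot;
    }

    // Used on later allocate calls: return what was recovered since the
    // last pull, then clear it.
    synchronized List<String> pullPreviousAttemptContainers() {
        List<String> pending =
            new ArrayList<>(recoveredPreviousAttemptContainers);
        recoveredPreviousAttemptContainers.clear();
        return pending;
    }
}
```

Because all three methods lock the same object, a recoverContainer call cannot 
run between the snapshot and the clear, which is the interleaving described in 
steps 1 to 3.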

> Restarted RM may not inform AM about all existing containers
> ------------------------------------------------------------
>
>                 Key: YARN-6168
>                 URL: https://issues.apache.org/jira/browse/YARN-6168
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Billie Rinaldi
>            Assignee: Chandni Singh
>         Attachments: YARN-6168.001.patch
>
>
> There appears to be a race condition when an RM is restarted. I had a 
> situation where the RMs and AM were down, but NMs and app containers were 
> still running. When I restarted the RM, the AM restarted, registered with the 
> RM, and received its list of existing containers before the NMs had reported 
> all of their containers to the RM. The AM was only told about some of the 
> app's existing containers.



