Jian He commented on YARN-1367:

bq. Should we do it in a separate jira? This can keep the NM side changes.
Given that it should be only about 5 extra lines of changes, can we just 
include it here? Splitting patches like this splits the context. It will be 
easier for new people to follow if both changes are in the same patch.

Repeating my previous comment: we should also rename 
testContainerPreservationOnResyncImpl, since it is used in both the 
killContainer and keepContainer test cases (it is also called from 
testKillContainersOnResync). Maybe rename it to something like 
testNMResyncImpl?

> After restart NM should resync with the RM without killing containers
> ---------------------------------------------------------------------
>                 Key: YARN-1367
>                 URL: https://issues.apache.org/jira/browse/YARN-1367
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-1367.001.patch, YARN-1367.002.patch, 
> YARN-1367.prototype.patch
> After RM restart, the RM sends a resync response to NMs that heartbeat to it. 
>  Upon receiving the resync response, the NM kills all containers and 
> re-registers with the RM. The NM should be changed to not kill the container 
> and instead inform the RM about all currently running containers including 
> their allocations etc. After the re-register, the NM should send all pending 
> container completions to the RM as usual.
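
For readers new to this issue, the behavior described above could be sketched 
roughly as follows. This is only an illustration under assumptions: 
ResyncSketch, launch, complete, onResync, and nextHeartbeat are hypothetical 
names and do not correspond to the real NodeManager classes or to the attached 
patches.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for an NM; not the real YARN NodeManager. */
class ResyncSketch {
    enum State { RUNNING, COMPLETE }

    /** Containers known to this node, keyed by a made-up container id. */
    private final Map<String, State> containers = new LinkedHashMap<>();

    /** Completed containers that have not been reported to the RM yet. */
    private final List<String> pendingCompletions = new ArrayList<>();

    void launch(String id) {
        containers.put(id, State.RUNNING);
    }

    void complete(String id) {
        containers.put(id, State.COMPLETE);
        pendingCompletions.add(id);
    }

    /**
     * Proposed behavior on a resync response: keep every container alive and
     * return the running ones so they can be reported when re-registering.
     */
    List<String> onResync() {
        List<String> running = new ArrayList<>();
        for (Map.Entry<String, State> e : containers.entrySet()) {
            if (e.getValue() == State.RUNNING) {
                running.add(e.getKey());
            }
        }
        return running;
    }

    /** After re-registering, pending completions go out as usual. */
    List<String> nextHeartbeat() {
        List<String> out = new ArrayList<>(pendingCompletions);
        pendingCompletions.clear();
        return out;
    }
}
```

In this sketch nothing is killed on resync: running containers are reported 
at re-registration, and completions queued before the resync are delivered on 
the following heartbeat.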

This message was sent by Atlassian JIRA
