Vinod Kumar Vavilapalli commented on YARN-1367:

Sorry for jumping in late; I just looked at this. It doesn't make much sense to 
have both NodeAction.RESYNC and RESYNC_KEEPING_CONTAINERS. We should have only 

Long term, we will only support work-preserving RM restart. For the interim, we 
can just make NodeManagers look at the work-preserving-rm-restart flag and then 
decide how to act on a RESYNC.
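
A rough sketch of that interim behavior, purely for illustration. The class and
method names here (ResyncSketch, NodeManagerSketch, onHeartbeatResponse) are
made up and are not the real NodeManager code, and the boolean is assumed to be
read from whatever configuration key carries the work-preserving-rm-restart
flag. The point is only that a single RESYNC value is enough if the NM branches
on the flag:

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class ResyncSketch {

    /** Hypothetical stand-in for the heartbeat actions, with a single RESYNC value. */
    enum NodeAction { NORMAL, RESYNC, SHUTDOWN }

    /** Hypothetical NM-side model; not the real NodeManager class. */
    static final class NodeManagerSketch {
        // Assumed to be read from the NM's configuration at startup.
        private final boolean workPreservingRestart;
        private final List<String> running = new ArrayList<>();

        NodeManagerSketch(boolean workPreservingRestart, List<String> containers) {
            this.workPreservingRestart = workPreservingRestart;
            this.running.addAll(containers);
        }

        /**
         * Returns the container ids the NM would report when re-registering.
         * For any action other than RESYNC there is no re-registration, so the
         * result is empty.
         */
        List<String> onHeartbeatResponse(NodeAction action) {
            if (action != NodeAction.RESYNC) {
                return Collections.emptyList();
            }
            if (!workPreservingRestart) {
                // Existing behavior: drop (kill) all containers before re-registering.
                running.clear();
            }
            // Work-preserving behavior: containers keep running and are reported to the RM.
            return new ArrayList<>(running);
        }

        List<String> runningContainers() {
            return Collections.unmodifiableList(running);
        }
    }
}
```

With the flag on, a RESYNC leaves the container list untouched and reports it on
re-register; with the flag off, the same RESYNC empties the list first, so the
RM-side enum does not need a second value.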

> After restart NM should resync with the RM without killing containers
> ---------------------------------------------------------------------
>                 Key: YARN-1367
>                 URL: https://issues.apache.org/jira/browse/YARN-1367
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-1367.001.patch, YARN-1367.002.patch, 
> YARN-1367.prototype.patch
> After RM restart, the RM sends a resync response to NMs that heartbeat to it. 
>  Upon receiving the resync response, the NM kills all containers and 
> re-registers with the RM. The NM should be changed to not kill the container 
> and instead inform the RM about all currently running containers including 
> their allocations etc. After the re-register, the NM should send all pending 
> container completions to the RM as usual.
