Anubhav Dhoot commented on YARN-1367:

[~jianhe] I am uploading a new patch that addresses some of the comments. Here 
are the remaining issues:

> we should have ResourceTrackerService change in this patch also to send 
> resync on non-work-preserving case and resync_keeping_containers in 
> work-preserving case ?
Should we do that in a separate JIRA? This one can then keep just the NM-side 
changes. I can open one if you agree.

>code should be cleaner if using separate if case 
I am trying to keep the two cases the same except for killing containers. Hence 
the boolean flag, which distinguishes that one line while the rest remains the same.
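
To illustrate the idea, here is a minimal sketch of one handler with a flag. 
The class, field, and method names (ResyncSketch, handleResync, reRegister, 
lastReported) are hypothetical and are not taken from the patch; the real NM 
code is more involved.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch: these names are illustrative and are not from the patch.
class ResyncSketch {
    // Stand-in for the containers currently running on the NM.
    final List<String> runningContainers = new ArrayList<>();
    // Stand-in for what the NM reported to the RM on its last registration.
    List<String> lastReported = new ArrayList<>();

    // Single code path for both events; only the kill step depends on the flag.
    void handleResync(boolean keepContainers) {
        if (!keepContainers) {
            runningContainers.clear(); // plain Resync: kill everything first
        }
        reRegister(); // shared by both cases
    }

    // Re-register with the RM, reporting whatever is still running.
    void reRegister() {
        lastReported = new ArrayList<>(runningContainers);
    }
}
```

With keepContainers set to true the running containers survive and are 
reported on re-register; with false the list is emptied first. Everything else 
is shared, which is why I would rather not split it into two separate branches.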

>testPreserveContainersOnResyncKeepingContainers -> testKeepContainersOnResync
The current name explicitly includes the event name, ResyncKeepingContainers, 
to indicate that it behaves differently from Resync.

Lemme know what you think

> After restart NM should resync with the RM without killing containers
> ---------------------------------------------------------------------
>                 Key: YARN-1367
>                 URL: https://issues.apache.org/jira/browse/YARN-1367
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Bikas Saha
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-1367.001.patch, YARN-1367.002.patch, 
> YARN-1367.prototype.patch
> After RM restart, the RM sends a resync response to NMs that heartbeat to it. 
>  Upon receiving the resync response, the NM kills all containers and 
> re-registers with the RM. The NM should be changed to not kill the container 
> and instead inform the RM about all currently running containers including 
> their allocations etc. After the re-register, the NM should send all pending 
> container completions to the RM as usual.
