[jira] [Commented] (YARN-1490) RM should optionally not kill all containers when an ApplicationMaster exits

Jian He (JIRA) Sat, 04 Jan 2014 13:18:22 -0800

    [ 
https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13862419#comment-13862419
 ]


Jian He commented on YARN-1490:
-------------------------------

bq. We are better off changing the dispatcher-related logic to look up the 
appId of the container, get the current attempt of that appId and then route 
the event to the current attempt.
I thought about this, but it can lead to a race: while the AM is restarting, 
the new attempt may not yet have been created in the scheduler, so the 
scheduler still points to the previous, dead attempt, and the container 
events would be sent to that dead attempt.
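
To make the race concrete, here is a minimal sketch of the proposed lookup 
(containerId -> appId -> current attempt). It is only an illustration: the 
class, its maps and its method names are hypothetical and are not the actual 
ResourceManager or scheduler code.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical model of the routing idea, not the real RM dispatcher.
final class AttemptRoutingSketch {
    // appId -> id of the attempt the scheduler currently knows about
    private final Map<String, Integer> currentAttemptByApp = new HashMap<>();
    // containerId -> appId that owns the container
    private final Map<String, String> appByContainer = new HashMap<>();

    void registerContainer(String containerId, String appId) {
        appByContainer.put(containerId, appId);
    }

    // Called when the scheduler creates (or replaces) the attempt for an app.
    void attemptCreated(String appId, int attemptId) {
        currentAttemptByApp.put(appId, attemptId);
    }

    // Returns the attempt id that would receive an event for this container,
    // or -1 if the container or its app is unknown.
    int routeContainerEvent(String containerId) {
        String appId = appByContainer.get(containerId);
        if (appId == null) {
            return -1;
        }
        return currentAttemptByApp.getOrDefault(appId, -1);
    }

    public static void main(String[] args) {
        AttemptRoutingSketch sketch = new AttemptRoutingSketch();
        sketch.attemptCreated("app_1", 1);
        sketch.registerContainer("container_1", "app_1");

        // The AM of attempt 1 has exited, but attempt 2 has not been created
        // in the scheduler yet, so the lookup still resolves to attempt 1.
        int duringRestart = sketch.routeContainerEvent("container_1");

        // Only after the new attempt is created does the lookup move on.
        sketch.attemptCreated("app_1", 2);
        int afterRestart = sketch.routeContainerEvent("container_1");

        System.out.println("during restart -> attempt " + duringRestart);
        System.out.println("after restart  -> attempt " + afterRestart);
    }
}
```

Any container event that arrives in the window between the AM exit and the 
second attemptCreated call is delivered to the dead attempt, which is the 
race described above.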

> RM should optionally not kill all containers when an ApplicationMaster exits
> ----------------------------------------------------------------------------
>
>                 Key: YARN-1490
>                 URL: https://issues.apache.org/jira/browse/YARN-1490
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Jian He
>         Attachments: YARN-1490.1.patch, YARN-1490.2.patch, YARN-1490.3.patch
>
>
> This is needed to enable work-preserving AM restart. Some apps can choose to 
> reconnect with old running containers; some may not want to. This should be 
> an option.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
