[ 
https://issues.apache.org/jira/browse/YARN-1490?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13884366#comment-13884366
 ] 

Steve Loughran commented on YARN-1490:
--------------------------------------

I'm just going to add some AM implementation notes for anyone using this feature:
# You can receive container exit events as soon as the AM is registered, so the 
registration and state-rebuilding process should be synchronized on something 
that the callback also blocks on.
# You will be notified of any containers that exited after the previous 
AM failure, and you won't know what those containers are, as they weren't in the 
list of containers supplied to the new AM. At the very least, ignore these.
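
A minimal sketch of both notes, in plain Java with no Hadoop dependencies. Everything here is hypothetical and for illustration only: the class AmStateTracker, its method names, and the use of String container IDs are assumptions, not part of the YARN API. In a real AM the callback would correspond to something like the onContainersCompleted method of the AMRMClientAsync callback handler, and the supplier would wrap the actual registration call.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

// Hypothetical helper: tracks containers inherited from the previous AM attempt.
class AmStateTracker {
    // Single lock shared by registration/rebuild and the completion callback.
    private final Object stateLock = new Object();
    private final Set<String> knownContainers = new HashSet<>();
    private final List<String> ignored = new ArrayList<>();

    // Registers the AM and rebuilds state while holding the lock, so a
    // callback arriving right after registration blocks until rebuild is done.
    // The supplier stands in for the real registration call and returns the
    // IDs of containers that survived from the previous attempt (assumption).
    public void registerAndRebuild(Supplier<List<String>> register) {
        synchronized (stateLock) {
            List<String> previous = register.get();
            knownContainers.addAll(previous);
        }
    }

    // Completion callback: blocks on the same lock as registration.
    // Returns the IDs that were known and handled; unknown IDs are
    // recorded and otherwise ignored.
    public List<String> onContainersCompleted(List<String> completedIds) {
        List<String> handled = new ArrayList<>();
        synchronized (stateLock) {
            for (String id : completedIds) {
                if (knownContainers.remove(id)) {
                    handled.add(id);
                } else {
                    ignored.add(id); // exited before this AM attempt knew of it
                }
            }
        }
        return handled;
    }

    // Snapshot of the container IDs that were ignored as unknown.
    public List<String> ignoredContainers() {
        synchronized (stateLock) {
            return new ArrayList<>(ignored);
        }
    }
}
```

Holding the lock across both the registration call and the rebuild is the point of the first note: the callback thread cannot observe half-built state. Whether to merely log unknown containers or do something more with them is left to the application.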

> RM should optionally not kill all containers when an ApplicationMaster exits
> ----------------------------------------------------------------------------
>
>                 Key: YARN-1490
>                 URL: https://issues.apache.org/jira/browse/YARN-1490
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Vinod Kumar Vavilapalli
>            Assignee: Jian He
>             Fix For: 2.4.0
>
>         Attachments: YARN-1490.1.patch, YARN-1490.10.patch, 
> YARN-1490.11.patch, YARN-1490.11.patch, YARN-1490.12.patch, 
> YARN-1490.2.patch, YARN-1490.3.patch, YARN-1490.4.patch, YARN-1490.5.patch, 
> YARN-1490.6.patch, YARN-1490.7.patch, YARN-1490.8.patch, YARN-1490.9.patch
>
>
> This is needed to enable work-preserving AM restart. Some apps can choose to 
> reconnect with old running containers, some may not want to. This should be 
> an option.



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
