[ 
https://issues.apache.org/jira/browse/YARN-5620?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15472777#comment-15472777
 ] 

Jian He commented on YARN-5620:
-------------------------------

[~asuresh], thanks for the explanation.
bq. Only the AM knows if the upgrade is actually successful.
How does the AM determine whether the upgrade is successful (i.e., what kind of 
signal should the AM depend on)? I feel that once the container starts running, 
even for the AM, it is hard to tell whether a failure is caused by the upgrade 
or by the runtime. IMO, if the container fails to launch on upgrade, it should 
be considered an upgrade failure. Once the container starts running, a 
container failure can be considered a runtime failure. If the user does want to 
roll back, the user can call the upgradeContainer/rollback command again. 
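
To make the proposed rule concrete, here is a minimal sketch of the classification. All names ({{ContainerPhase}}, {{FailureKind}}, {{UpgradeFailureClassifier}}) are hypothetical and are not existing YARN classes; the sketch only assumes that whoever reports the failure knows whether the container had already started running.

```java
// Hypothetical sketch: none of these types exist in YARN.
// Phase the container was in when the failure was observed.
enum ContainerPhase { LAUNCHING, RUNNING }

// How the failure should be attributed under the proposed rule.
enum FailureKind { UPGRADE_FAILURE, RUNTIME_FAILURE }

final class UpgradeFailureClassifier {
    private UpgradeFailureClassifier() {}

    // A failure before the container starts running is blamed on the upgrade;
    // a failure after it has started running is treated as a runtime failure.
    static FailureKind classify(ContainerPhase phaseAtFailure) {
        return phaseAtFailure == ContainerPhase.LAUNCHING
                ? FailureKind.UPGRADE_FAILURE
                : FailureKind.RUNTIME_FAILURE;
    }
}
```

With this rule, nothing rolls back automatically once the container is running; the caller has to ask for the rollback explicitly.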
bq.  But, in my opinion rollback should not be provided with an explicit 
launchContext, it should always be the just previous context.
I also agree that the AM can take care of tying the context to a version. In 
our case, the Slider AM (also YARN code) will have the prior context and will 
call upgradeContainer with the corresponding context, so the NM does not need 
to remember the prior context.
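
As a rough illustration of the AM-side bookkeeping, the sketch below keeps a stack of earlier contexts so that a rollback is simply another upgrade call with the previous context. {{VersionedContext}} and {{ContextHistory}} are hypothetical names; a real AM would presumably hold {{ContainerLaunchContext}} objects, and the actual call to the NM is deliberately left out.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Optional;

// Hypothetical stand-in for a launch context tagged with a version.
final class VersionedContext {
    final String version;
    final String launchCommand;

    VersionedContext(String version, String launchCommand) {
        this.version = version;
        this.launchCommand = launchCommand;
    }
}

// Hypothetical AM-side history: the AM, not the NM, remembers prior contexts.
final class ContextHistory {
    private final Deque<VersionedContext> previous = new ArrayDeque<>();
    private VersionedContext current;

    ContextHistory(VersionedContext initial) {
        this.current = initial;
    }

    VersionedContext current() {
        return current;
    }

    // Record an upgrade; the old context is kept so it can be re-sent later.
    VersionedContext upgradeTo(VersionedContext next) {
        previous.push(current);
        current = next;
        return current;
    }

    // Returns the context to re-send for a rollback, or empty if none exists.
    Optional<VersionedContext> rollback() {
        if (previous.isEmpty()) {
            return Optional.empty();
        }
        current = previous.pop();
        return Optional.of(current);
    }
}
```

The AM would pass the context returned by {{rollback()}} to the same upgrade call it used before, so the NM only ever sees "upgrade to this context".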

I think the upgrade itself is enough work for a single JIRA, with plenty of 
corner cases. Could you split the patch so that this JIRA includes only the 
upgrade piece? That would also make review easier.



> Core changes in NodeManager to support for upgrade and rollback of Containers
> -----------------------------------------------------------------------------
>
>                 Key: YARN-5620
>                 URL: https://issues.apache.org/jira/browse/YARN-5620
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Arun Suresh
>            Assignee: Arun Suresh
>         Attachments: YARN-5620.001.patch, YARN-5620.002.patch, 
> YARN-5620.003.patch
>
>
> JIRA proposes to modify the ContainerManager (and other core classes) to 
> support upgrade of a running container with a new {{ContainerLaunchContext}} 
> as well as the ability to rollback the upgrade if the container is not able 
> to restart using the new launch Context. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
