[ 
https://issues.apache.org/jira/browse/YARN-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531690#comment-16531690
 ] 

Sunil Govindan commented on YARN-8473:
--------------------------------------

Thanks [~jlowe] for analyzing this and sharing the patch. I have one question 
about the patch.

In the default case, a ContainerKillEvent is now raised stating that the app is 
not running and hence the container is being killed. In which scenario can a 
container reach this case? I think common error handling is much safer here, 
since it avoids leaving orphaned containers. However, could we also add some 
error logs that print the container ID, states, etc. to help debug such cases?
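
To illustrate the kind of logging being asked for, here is a minimal sketch. 
It does not use the actual NodeManager classes or the code in the patch: the 
AppState enum, the handleInitContainer method, and the two lists standing in 
for the event dispatcher and the logger are all hypothetical names chosen so 
the example runs on its own.

```java
import java.util.ArrayList;
import java.util.List;

class InitContainerSketch {
    // Hypothetical, simplified application states (not the real NM enum).
    enum AppState { INITING, RUNNING, FINISHING, FINISHED }

    // Stand-ins for the event dispatcher and the logger, so the sketch
    // is self-contained. Real code would dispatch a kill event and call
    // the logger's error method instead.
    static final List<String> killEvents = new ArrayList<>();
    static final List<String> errorLogs = new ArrayList<>();

    static void handleInitContainer(String appId, AppState appState,
                                    String containerId) {
        switch (appState) {
            case INITING:
            case RUNNING:
                // Expected states: the container is allowed to proceed.
                break;
            default:
                // Unexpected state: record enough context to debug it,
                // then kill the container so it is not left orphaned.
                errorLogs.add("Killing container " + containerId
                        + " because application " + appId
                        + " is in state " + appState
                        + " and cannot accept new containers");
                killEvents.add(containerId);
                break;
        }
    }

    public static void main(String[] args) {
        handleInitContainer("app_1", AppState.RUNNING, "container_1");
        handleInitContainer("app_1", AppState.FINISHING, "container_2");
        for (String line : errorLogs) {
            System.out.println(line);
        }
    }
}
```

The point of the sketch is only that the default branch records the container 
ID and the application state before raising the kill, so that an unexpected 
path into this case can be identified from the logs.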

> Containers being launched as app tears down can leave containers in NEW state
> -----------------------------------------------------------------------------
>
>                 Key: YARN-8473
>                 URL: https://issues.apache.org/jira/browse/YARN-8473
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.8.4
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>         Attachments: YARN-8473.001.patch, YARN-8473.002.patch
>
>
> I saw a case where containers were stuck on a nodemanager in the NEW state 
> because they tried to launch just as an application was tearing down.  The 
> container sent an INIT_CONTAINER event to the ApplicationImpl which then 
> executed an invalid transition since that event is not handled/expected when 
> the application is in the process of tearing down.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org
