[ https://issues.apache.org/jira/browse/YARN-8473?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16531690#comment-16531690 ]
Sunil Govindan commented on YARN-8473:
--------------------------------------

Thanks [~jlowe] for analyzing this and sharing the patch.

I have one question about the patch. In the default case, a ContainerKillEvent is now raised, stating that the app is not running and that the container is therefore being killed. In which scenario can a container reach this case?

I think common error handling is much safer here, since it avoids leaving orphaned containers. However, could we also add some error logs that print the container ID, states, etc., to help debug such cases?

> Containers being launched as app tears down can leave containers in NEW state
> -----------------------------------------------------------------------------
>
>                 Key: YARN-8473
>                 URL: https://issues.apache.org/jira/browse/YARN-8473
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>   Affects Versions: 2.8.4
>            Reporter: Jason Lowe
>            Assignee: Jason Lowe
>            Priority: Major
>         Attachments: YARN-8473.001.patch, YARN-8473.002.patch
>
>
> I saw a case where containers were stuck on a nodemanager in the NEW state
> because they tried to launch just as an application was tearing down. The
> container sent an INIT_CONTAINER event to the ApplicationImpl which then
> executed an invalid transition since that event is not handled/expected when
> the application is in the process of tearing down.
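
To make the discussion concrete, here is a minimal, self-contained Java sketch of the behavior being discussed: an INIT_CONTAINER event that arrives while the application is not running falls into a default branch, which kills the container and logs its ID and states instead of leaving it in NEW. This is not the code from the attached patches. The class AppTeardownSketch, its enums, and the method handleInitContainer are hypothetical simplifications, and the real ApplicationImpl uses a state machine rather than a switch statement.

```java
import java.util.ArrayList;
import java.util.List;

public class AppTeardownSketch {
    // Hypothetical, simplified application states (not the full NodeManager set).
    enum AppState { RUNNING, FINISHING_CONTAINERS_WAIT, FINISHED }

    // Hypothetical, simplified container states.
    enum ContainerState { NEW, LOCALIZING, KILLED }

    static final class Container {
        final String id;
        ContainerState state = ContainerState.NEW;
        String diagnostics = "";

        Container(String id) {
            this.id = id;
        }
    }

    static final class App {
        AppState state = AppState.RUNNING;
        // Stand-in for an error log; a real implementation would use a logger.
        final List<String> errorLog = new ArrayList<>();

        // Handles an INIT_CONTAINER-style request for the given container.
        void handleInitContainer(Container c) {
            switch (state) {
                case RUNNING:
                    // Normal path: the container proceeds with initialization.
                    c.state = ContainerState.LOCALIZING;
                    break;
                default:
                    // The app is tearing down or finished. Kill the container
                    // rather than leaving it orphaned in NEW, and record the
                    // container ID and both states to help debugging.
                    String msg = "Killing container " + c.id
                        + " (container state " + c.state + ")"
                        + " because application is in state " + state;
                    errorLog.add(msg);
                    c.diagnostics = msg;
                    c.state = ContainerState.KILLED;
                    break;
            }
        }
    }

    public static void main(String[] args) {
        App app = new App();

        Container early = new Container("container_1");
        app.handleInitContainer(early);
        System.out.println(early.id + " -> " + early.state);

        // The application starts tearing down before the next container inits.
        app.state = AppState.FINISHING_CONTAINERS_WAIT;

        Container late = new Container("container_2");
        app.handleInitContainer(late);
        System.out.println(late.id + " -> " + late.state);
        System.out.println(late.diagnostics);
    }
}
```

The message is built before the container state is changed, so the log records the state the container was in when the event arrived (NEW in the scenario from the description).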