[ https://issues.apache.org/jira/browse/YARN-1922?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14195452#comment-14195452 ]
Vinod Kumar Vavilapalli commented on YARN-1922:
-----------------------------------------------

Sorry, didn't look at your previous comment given the progress on other patches.

So, I think overall we need to do the following:
{code}
while (pidFile is not present && the process has not crashed) {
  // loop
}
{code}

This is the same as your do {} while {} loop.

+1 for your YARN-1922.5.patch. Checking this in.

> Process group remains alive after container process is killed externally
> ------------------------------------------------------------------------
>
>                 Key: YARN-1922
>                 URL: https://issues.apache.org/jira/browse/YARN-1922
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.4.0
>         Environment: CentOS 6.4
>            Reporter: Billie Rinaldi
>            Assignee: Billie Rinaldi
>         Attachments: YARN-1922.1.patch, YARN-1922.2.patch, YARN-1922.3.patch,
> YARN-1922.4.patch, YARN-1922.5.patch, YARN-1922.6.patch
>
>
> If the main container process is killed externally, ContainerLaunch does not
> kill the rest of the process group. Before sending the event that results in
> the ContainerLaunch.containerCleanup method being called, ContainerLaunch
> sets the "completed" flag to true. Then when cleaning up, it doesn't try to
> read the pid file if the completed flag is true. If it read the pid file, it
> would proceed to send the container a kill signal. In the case of the
> DefaultContainerExecutor, this would kill the process group.

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
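
For illustration, the wait loop discussed in the comment above might be sketched in Java roughly as follows. This is a minimal sketch under stated assumptions, not the code from any YARN-1922 patch: the class PidFileWaiter, the method waitForPid, and the timeout and poll-interval parameters are hypothetical names chosen for this example. The pid file is checked at least once even when the "completed" flag is already set, which is the case the issue description says cleanup was skipping.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.concurrent.atomic.AtomicBoolean;

/** Hypothetical helper for illustration; not the actual ContainerLaunch code. */
final class PidFileWaiter {

  private PidFileWaiter() {}

  /**
   * Waits until the pid file is present, the process is reported as
   * completed, or the timeout elapses, whichever happens first.
   *
   * @return the pid read from the file, or null if no pid file was found
   */
  static String waitForPid(Path pidFile, AtomicBoolean completed,
                           long timeoutMs, long pollMs)
      throws IOException, InterruptedException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      // Always look for the pid file first, even if the process has
      // already completed, so an externally killed process is not missed.
      if (Files.exists(pidFile)) {
        String pid = new String(Files.readAllBytes(pidFile),
            StandardCharsets.UTF_8).trim();
        if (!pid.isEmpty()) {
          return pid;
        }
        // An empty file is treated as "not written yet"; keep polling.
      }
      // Stop once the process has finished or the timeout has expired.
      if (completed.get() || System.nanoTime() - deadline >= 0) {
        return null;
      }
      Thread.sleep(pollMs);
    }
  }
}
```

A cleanup path could pass the returned pid to whatever signalling mechanism the container executor provides. A null result means no pid file was found before the process finished or the timeout expired.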