Varun Vasudev commented on YARN-1922:

Thanks for the patch Billie. Some comments -
+    if (Shell.WINDOWS) {
+      return;
+    }
Is it possible to change this to -
+ Assume.assumeTrue(Shell.isSetsidAvailable);
As I understand it, the functionality to kill child processes on non-Windows 
platforms requires the setsid command.

Also, you check if the parent process is killed here - 
+    Assert.assertFalse("Process is still alive!",
+        DefaultContainerExecutor.containerIsAlive(pid));
Shouldn't we also check if the child process has been killed? On my Mac for 
example(after applying your patch) -
HWxxxxx:hadoop-yarn-server-nodemanager vvasudev$ setsid
-bash: setsid: command not found
HWxxxxx:hadoop-yarn-server-nodemanager vvasudev$ mvn clean test 
 T E S T S
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 5.215 sec - in 

Results :

Tests run: 1, Failures: 0, Errors: 0, Skipped: 0

[INFO] ------------------------------------------------------------------------
[INFO] ------------------------------------------------------------------------
[INFO] Total time: 11.908s
[INFO] Finished at: Sun Nov 02 00:37:18 IST 2014
[INFO] Final Memory: 25M/306M
[INFO] ------------------------------------------------------------------------
HWxxxxx:hadoop-yarn-server-nodemanager vvasudev$ ps aux | grep bash | grep 
vvasudev        37129   0.0  0.0  2436436    536 s002  S    12:37AM   0:00.01 

> Process group remains alive after container process is killed externally
> ------------------------------------------------------------------------
>                 Key: YARN-1922
>                 URL: https://issues.apache.org/jira/browse/YARN-1922
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.4.0
>         Environment: CentOS 6.4
>            Reporter: Billie Rinaldi
>            Assignee: Billie Rinaldi
>         Attachments: YARN-1922.1.patch, YARN-1922.2.patch, YARN-1922.3.patch, 
> YARN-1922.4.patch
> If the main container process is killed externally, ContainerLaunch does not 
> kill the rest of the process group.  Before sending the event that results in 
> the ContainerLaunch.containerCleanup method being called, ContainerLaunch 
> sets the "completed" flag to true.  Then when cleaning up, it doesn't try to 
> read the pid file if the completed flag is true.  If it read the pid file, it 
> would proceed to send the container a kill signal.  In the case of the 
> DefaultContainerExecutor, this would kill the process group.

This message was sent by Atlassian JIRA

Reply via email to