[ 
https://issues.apache.org/jira/browse/YARN-3066?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14279287#comment-14279287
 ] 

Dmitry Sivachenko commented on YARN-3066:
-----------------------------------------

The Windows case is tested separately; see private static boolean 
isSetsidSupported() in
hadoop-common-project/hadoop-common/src/main/java/org/apache/hadoop/util/Shell.java

for instance:

if (Shell.WINDOWS) {
      return false;
}

On any UNIX-like system, I suppose it will leave orphaned processes, because 
when isSetsidSupported() returns false it uses kill(pid) to kill only the task 
instead of kill(pgid) to kill the whole process group.
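
To illustrate the difference, here is a minimal sketch of how the kill 
command line could be built. It is not the actual Hadoop code; the class and 
method names are hypothetical. It relies on the fact that kill(1) treats a 
negative pid as a process group id, which only helps if the task was started 
in its own session or process group.

```java
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, for illustration only; not part of Hadoop.
final class KillCommandSketch {

    // Builds the argv for kill(1).
    // ownSession == true  -> the task leads its own session/process group,
    //                        so "-<pid>" signals every process in the group.
    // ownSession == false -> only the single process is signalled, and any
    //                        children it spawned keep running.
    static List<String> killCommand(int signal, long pid, boolean ownSession) {
        String target = ownSession ? "-" + pid : Long.toString(pid);
        return Arrays.asList("kill", "-" + signal, target);
    }
}
```

With ownSession set to false, the children of a shell-script task are never 
signalled, which matches the orphaned processes described in this issue.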

ssid(1) on FreeBSD is the analog of setsid(1) on Linux: a userland wrapper for 
the setsid() system call.

Renaming does not sound like a sane idea, because it is hard to convince 
everyone to rename installed binaries by hand.

I propose to treat it as a system-dependent option and act accordingly.

(I suppose other OSes, such as Solaris, also lack a setsid(1) utility, so they 
could also benefit.)

For the ssid source, see http://tools.suckless.org/ssid/

As for backwards compatibility, we can make that change in 3.0. It is not 
fatal: a failure to start without setsid will just remind users to install 
setsid(1) or ssid(1) before proceeding further, and they can be sure that there 
will be no side effects such as orphaned tasks eating CPU.
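
A rough sketch of what I mean, assuming the wrapper is looked up at startup 
rather than at configure time. The class name, method names and the "strict" 
flag are hypothetical and only illustrate the idea; this is not a patch against 
Shell.java.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Arrays;
import java.util.List;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical helper, for illustration only; not part of Hadoop.
final class SessionWrapperSketch {

    // Candidates in preference order: setsid(1) on Linux, ssid(1) on FreeBSD.
    static final List<String> CANDIDATES = Arrays.asList("setsid", "ssid");

    // Returns true if an executable regular file with this name is on PATH.
    static boolean onPath(String name) {
        String path = System.getenv("PATH");
        if (path == null) {
            return false;
        }
        for (String dir : path.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, name);
            if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                return true;
            }
        }
        return false;
    }

    // Picks the first candidate that the given check reports as available.
    static Optional<String> findWrapper(Predicate<String> isAvailable) {
        return CANDIDATES.stream().filter(isAvailable).findFirst();
    }

    // Builds the prefix used to spawn the task. In strict mode a missing
    // wrapper is an error at startup instead of a silent fallback to "exec".
    static String execPrefix(Predicate<String> isAvailable, boolean strict) {
        Optional<String> wrapper = findWrapper(isAvailable);
        if (wrapper.isPresent()) {
            return "exec " + wrapper.get();
        }
        if (strict) {
            throw new IllegalStateException(
                "neither setsid(1) nor ssid(1) found on PATH");
        }
        return "exec";
    }
}
```

A caller would use something like 
execPrefix(SessionWrapperSketch::onPath, true); the non-strict mode keeps 
today's behaviour of falling back to plain "exec".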

> Hadoop leaves orphaned tasks running after job is killed
> --------------------------------------------------------
>
>                 Key: YARN-3066
>                 URL: https://issues.apache.org/jira/browse/YARN-3066
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>         Environment: Hadoop 2.4.1 (probably all later too), FreeBSD-10.1
>            Reporter: Dmitry Sivachenko
>
> When spawning user task, node manager checks for setsid(1) utility and spawns 
> task program via it. See 
> hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/main/java/org/apache/hadoop/yarn/server/nodemanager/DefaultContainerExecutor.java
>  for instance:
> String exec = Shell.isSetsidAvailable? "exec setsid" : "exec";
> FreeBSD, unlike Linux, does not have setsid(1) utility.  So plain "exec" is 
> used to spawn user task.  If that task spawns other external programs (this 
> is common case if a task program is a shell script) and user kills job via 
> mapred job -kill <Job>, these child processes remain running.
> 1) Why do you silently ignore the absence of setsid(1) and spawn task process 
> via exec: this is the guarantee to have orphaned processes when job is 
> prematurely killed.
> 2) FreeBSD has a replacement third-party program called ssid (which does 
> almost the same as Linux's setsid).  It would be nice to detect which binary 
> is present during configure stage and put @SETSID@ macros into java file to 
> use the correct name.
> I propose to make Shell.isSetsidAvailable test more strict and fail to start 
> if it is not found:  at least we will know about the problem at start rather 
> than guess why there are orphaned tasks running forever.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
