Vinod Kumar Vavilapalli commented on YARN-3678:

The default delay is 250 milliseconds, so it is very hard to hit this condition.

At least when LinuxContainerExecutor is used, the kill is done as the user 
itself, so it's unlikely it will affect other users' processes.

Other than also doing a user-check to ensure it's the same user's container, I 
am not sure what else can be done.
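
For illustration only, a user-check of this kind could look roughly like the 
sketch below. It assumes Java 9+ (ProcessHandle) and is not the NodeManager's 
actual code; OwnerCheck and ownedBy are hypothetical names. It narrows the 
window but does not close it, since a reused PID could still belong to the 
same user.

```java
import java.util.Optional;

public class OwnerCheck {
    // Hypothetical helper: returns true only if a process with this pid
    // exists and the OS reports expectedUser as its owner.
    static boolean ownedBy(long pid, String expectedUser) {
        Optional<String> owner =
            ProcessHandle.of(pid).flatMap(h -> h.info().user());
        return owner.isPresent() && owner.get().equals(expectedUser);
    }

    public static void main(String[] args) {
        long self = ProcessHandle.current().pid();
        // A caller would skip the kill when this check fails.
        System.out.println(ownedBy(self, "no-such-user-for-this-check"));
    }
}
```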

> DelayedProcessKiller may kill other process other than container
> ----------------------------------------------------------------
>                 Key: YARN-3678
>                 URL: https://issues.apache.org/jira/browse/YARN-3678
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.6.0
>            Reporter: gu-chi
>            Priority: Critical
> Suppose a container has finished and its cleanup runs while the PID file 
> still exists. signalContainer is then triggered once and kills the process 
> whose PID is recorded in the PID file. Because the container has already 
> finished, that PID may have been reused by another process, which can cause 
> serious problems.
> My NM was killed unexpectedly, and as far as I know what I described could 
> be the cause, even though it occurs rarely.

This message was sent by Atlassian JIRA
