[jira] [Commented] (YARN-4459) container-executor might kill process wrongly

Jun Gong (JIRA) Tue, 24 May 2016 08:39:47 -0700

    [ https://issues.apache.org/jira/browse/YARN-4459?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15298361#comment-15298361 ]


Jun Gong commented on YARN-4459:
--------------------------------

Thanks [~jlowe] for reviewing and updating the patch!

{quote}
This could be improved upon by adding a just-before-kill check of some sort 
and/or proactive cancelling of the timer when we see the child process exit 
before the SIGKILL is sent. 
{quote}
Sorry for not adding it for so long. I will add it in a follow-up JIRA if that 
is OK.
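
To make the suggestion in the quote concrete, here is a rough sketch in C of a 
just-before-kill check. It is not taken from the patch: the names 
'group_alive' and 'terminate_group_with_grace' are hypothetical, and a simple 
polling loop stands in for the timer, so returning early from the loop plays 
the role of cancelling it. It assumes the container runs in its own process 
group.

```c
#define _POSIX_C_SOURCE 200809L
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <time.h>

/* Hypothetical helper: 1 if signal 0 can be delivered to the process group
 * (so it still has members), 0 otherwise. */
static int group_alive(pid_t pgid) {
  return kill(-pgid, 0) == 0;
}

/* Hypothetical sketch: send SIGTERM to the group, wait up to grace_ms for it
 * to exit, and send SIGKILL only if the group is still there.
 * Returns 0 if no SIGKILL was sent, 1 if SIGKILL was sent,
 * -1 if pgid is not a usable process group id. */
int terminate_group_with_grace(pid_t pgid, int grace_ms) {
  const struct timespec poll_interval = {0, 10 * 1000 * 1000}; /* 10 ms */
  int waited;

  if (pgid <= 1) {
    errno = EINVAL;
    return -1;
  }
  if (!group_alive(pgid)) {
    return 0; /* container already finished, nothing to signal */
  }
  kill(-pgid, SIGTERM);

  for (waited = 0; waited < grace_ms; waited += 10) {
    if (!group_alive(pgid)) {
      return 0; /* exited during the grace period: skip the SIGKILL */
    }
    nanosleep(&poll_interval, NULL);
  }

  /* Just-before-kill check: the group may have exited at the last moment. */
  if (!group_alive(pgid)) {
    return 0;
  }
  kill(-pgid, SIGKILL);
  return 1;
}
```

A small window remains between the final check and the SIGKILL, so this only 
narrows the race. Note also that an exited child that has not been reaped yet 
still counts as a member of its group.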

> container-executor might kill process wrongly
> ---------------------------------------------
>
>                 Key: YARN-4459
>                 URL: https://issues.apache.org/jira/browse/YARN-4459
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>            Reporter: Jun Gong
>            Assignee: Jun Gong
>         Attachments: YARN-4459.01.patch, YARN-4459.02.patch, 
> YARN-4459.03.patch
>
>
> When 'signal_container_as_user' is called in container-executor, it first 
> checks whether the process group exists. If it does not, it kills the 
> process itself (if that process exists). This is not reasonable: a missing 
> process group means the corresponding container has already finished, so 
> killing the process itself just kills the wrong process.
> We have seen this happen in our cluster many times. We used the same account 
> to start the NM and to submit the app, and container-executor sometimes 
> killed the NM (the wrongly killed process might just have been a newly 
> started thread that was the NM's child process).



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]
