Jun Gong created YARN-4459:
------------------------------
Summary: container-executor might kill process wrongly
Key: YARN-4459
URL: https://issues.apache.org/jira/browse/YARN-4459
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager
Reporter: Jun Gong
Assignee: Jun Gong
When 'signal_container_as_user' is called in container-executor, it first checks
whether the process group exists. If it does not, it kills the process itself
(if that process exists). This is not reasonable: if the process group does not
exist, the corresponding container has already finished, so killing the
process itself just kills the wrong process.
We have seen this happen many times in our cluster. We use the same account to
start the NM and to submit apps, and container-executor sometimes killed the NM
(the wrongly killed process might have been a newly started thread that was a
child process of the NM).
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)