Junping Du created YARN-4790:
--------------------------------
Summary: Per user blacklist node for user specific error for
container launch failure.
Key: YARN-4790
URL: https://issues.apache.org/jira/browse/YARN-4790
Project: Hadoop YARN
Issue Type: Bug
Components: applications
Reporter: Junping Du
Assignee: Junping Du
Some container launch failures are user specific. For example, when
LinuxContainerExecutor is enabled but the user does not exist on some node,
container launch fails on that node with the following information:
{noformat}
2016-02-14 15:37:03,111 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.attempt.RMAppAttemptImpl:
appattempt_1434045496283_0036_000002 State change from LAUNCHED to FAILED
2016-02-14 15:37:03,111 INFO
org.apache.hadoop.yarn.server.resourcemanager.rmapp.RMAppImpl: Application
application_1434045496283_0036 failed 2 times due to AM Container for
appattempt_1434045496283_0036_000002 exited with exitCode: -1000 due to:
Application application_1434045496283_0036 initialization failed (exitCode=255)
with output: User jdu not found
{noformat}
Obviously, this node is not suitable for launching containers for this user's
other applications either. We need a per-user blacklist tracking mechanism
rather than the current per-application one.
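
A rough sketch of what per-user tracking could look like is below. This is only
an illustration, not existing YARN code: the class PerUserNodeBlacklist, its
method names, and matching on the "User <name> not found" diagnostics text are
all assumptions made for this example. A real change would have to hook into
the existing blacklist handling in the scheduler.

```java
import java.util.Collections;
import java.util.Map;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

/** Hypothetical sketch of a per-user node blacklist; not an existing YARN class. */
class PerUserNodeBlacklist {
  // user name -> nodes on which that user's containers failed for a user-specific reason
  private final Map<String, Set<String>> blacklistedNodesByUser =
      new ConcurrentHashMap<>();

  /**
   * Assumption: a user-specific failure is recognized by the diagnostics text
   * seen in the log above ("User jdu not found"). A real implementation would
   * more likely key off a dedicated exit code or exception type.
   */
  static boolean isUserSpecificFailure(String user, String diagnostics) {
    return diagnostics != null
        && diagnostics.contains("User " + user + " not found");
  }

  /** Blacklists the node for this user only if the failure is user specific. */
  void recordLaunchFailure(String user, String node, String diagnostics) {
    if (isUserSpecificFailure(user, diagnostics)) {
      blacklistedNodesByUser
          .computeIfAbsent(user, u -> ConcurrentHashMap.newKeySet())
          .add(node);
    }
  }

  /** True if the node should be skipped for any application of this user. */
  boolean isBlacklisted(String user, String node) {
    return blacklistedNodesByUser
        .getOrDefault(user, Collections.emptySet())
        .contains(node);
  }
}
```

Because the map is keyed by user rather than by application, a failure recorded
for one of jdu's applications would keep the node from being picked for jdu's
other applications, while other users are unaffected.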