zhengchenyu created YARN-6091:
---------------------------------
Summary: AppMaster registration fails when using Docker on
LinuxContainer
Key: YARN-6091
URL: https://issues.apache.org/jira/browse/YARN-6091
Project: Hadoop YARN
Issue Type: Bug
Components: nodemanager, yarn
Affects Versions: 2.8.0
Environment: CentOS
Reporter: zhengchenyu
Priority: Critical
Fix For: 2.8.0
On some servers, when I use Docker on LinuxContainer, the AppMaster fails to
register with the ResourceManager. This does not happen on other servers.
I found that pclose (in container-executor.c) returns different values on
different servers, even though the process launched by popen is running
normally. Some servers return 0, and others return 13.
YARN treats the application as failed when pclose returns a nonzero value and
removes its AMRMToken. The AppMaster's registration then fails because the
ResourceManager has already removed the application's token.
In container-executor.c, the check is whether the return code is zero. However,
according to the pclose man page, a return value of -1 indicates an error. I
changed the check accordingly, which solved the problem.
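
For illustration only, here is a minimal sketch of one way to interpret the
value returned by pclose. It is not the patch for this issue and is not taken
from container-executor.c; the names cmd_result, classify_pclose_status and
run_command are hypothetical. It relies on the documented behavior that pclose
returns -1 on error and otherwise a wait status that can be decoded with the
macros from <sys/wait.h>. Under the usual Linux encoding, a raw status of 13
would decode as "terminated by signal 13" rather than as an exit code, but the
report above does not confirm that this is what happened on the affected
servers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdio.h>
#include <sys/wait.h>

/* Hypothetical result codes for this sketch; not from container-executor.c. */
enum cmd_result {
  CMD_OK = 0,
  CMD_POPEN_FAILED = -1,
  CMD_PCLOSE_FAILED = -2,
  CMD_EXITED_NONZERO = -3,
  CMD_SIGNALED = -4
};

/* Decode the value returned by pclose(): -1 is an error from pclose itself,
 * anything else is a wait status that must be inspected with the W* macros. */
static enum cmd_result classify_pclose_status(int status) {
  if (status == -1) {
    return CMD_PCLOSE_FAILED;
  }
  if (WIFEXITED(status)) {
    return WEXITSTATUS(status) == 0 ? CMD_OK : CMD_EXITED_NONZERO;
  }
  if (WIFSIGNALED(status)) {
    return CMD_SIGNALED;
  }
  /* Any other status is treated as a failure in this sketch. */
  return CMD_PCLOSE_FAILED;
}

/* Run a shell command through popen, discard its output, classify the result. */
static enum cmd_result run_command(const char *cmd) {
  assert(cmd != NULL);
  FILE *fp = popen(cmd, "r");
  if (fp == NULL) {
    return CMD_POPEN_FAILED;
  }
  char buf[256];
  while (fgets(buf, sizeof(buf), fp) != NULL) {
    /* Drain the pipe so the child is not blocked on a full buffer. */
  }
  return classify_pclose_status(pclose(fp));
}
```

A caller could then compare the result of run_command("some command") against
CMD_OK instead of comparing the raw pclose value against zero. Whether a
signaled child should count as a failure is a design choice that depends on
the caller.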