[ 
https://issues.apache.org/jira/browse/YARN-7999?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16397344#comment-16397344
 ] 

Eric Yang edited comment on YARN-7999 at 3/13/18 5:38 PM:
----------------------------------------------------------

[~jlowe] This is non entry_point docker image with clean checkout of trunk 
code.  There is no shared storage between nodes.  I turned off docker daemon on 
one of the node and this problem surfaced.  However, I think my earlier 
analysis is incorrect.  The directory problem also exists in node that have 
functional docker.  However, it appears to be a race condition between log 
directory creation and docker run.  Please see the attached log file (q3.log).


was (Author: eyang):
[~jlowe] This is non entry_point docker image with clean checkout of trunk 
code.  There is no shared storage between nodes.  I turned off docker daemon on 
one of the node and this problem surfaced.  However, I think my earlier 
analysis is incorrect.  The directory problem also exists in node that have 
functional docker.  However, it appears to be a race condition between log 
directory creation and docker run.  Please see the attached log file.

> Docker launch fails when user private filecache directory is missing
> --------------------------------------------------------------------
>
>                 Key: YARN-7999
>                 URL: https://issues.apache.org/jira/browse/YARN-7999
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 3.1.0
>            Reporter: Eric Yang
>            Assignee: Jason Lowe
>            Priority: Major
>         Attachments: YARN-7999.001.patch, YARN-7999.002.patch
>
>
> Docker container is failing to launch in trunk.  The root cause is:
> {code}
> [COMPINSTANCE sleeper-1 : container_1520032931921_0001_01_000020]: 
> [2018-03-02 23:26:09.196]Exception from container-launch.
> Container id: container_1520032931921_0001_01_000020
> Exit code: 29
> Exception message: image: hadoop/centos:latest is trusted in hadoop registry.
> Could not determine real path of mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Could not determine real path of mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache'
> Invalid docker mount 
> '/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache:/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache',
>  realpath=/tmp/hadoop-yarn/nm-local-dir/usercache/hbase/filecache
> Error constructing docker command, docker error code=12, error 
> message='Invalid docker mount'
> Shell output: main : command provided 4
> main : run as user is hbase
> main : requested yarn user is hbase
> Creating script paths...
> Creating local dirs...
> [2018-03-02 23:26:09.240]Diagnostic message from attempt 0 : [2018-03-02 
> 23:26:09.240]
> [2018-03-02 23:26:09.240]Container exited with a non-zero exit code 29.
> [2018-03-02 23:26:39.278]Could not find 
> nmPrivate/application_1520032931921_0001/container_1520032931921_0001_01_000020//container_1520032931921_0001_01_000020.pid
>  in any of the directories
> [COMPONENT sleeper]: Failed 11 times, exceeded the limit - 10. Shutting down 
> now...
> {code}
> The filecache cant not be mounted because it doesn't exist.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

---------------------------------------------------------------------
To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to