Adam Antal created YARN-9923:
--------------------------------
Summary: Detect missing Docker binary or not running Docker daemon
Key: YARN-9923
URL: https://issues.apache.org/jira/browse/YARN-9923
Project: Hadoop YARN
Issue Type: New Feature
Components: nodemanager, yarn
Affects Versions: 3.2.1
Reporter: Adam Antal
Assignee: Adam Antal
Currently, if a NodeManager is enabled to allocate Docker containers but the
specified binary (docker.binary in container-executor.cfg) is missing, the
container allocation fails with the following error message:
{noformat}
Container launch fails
Exit code: 29
Exception message: Launch container failed
Shell error output: sh: <docker binary path, /usr/bin/docker by default>: No
such file or directory
Could not inspect docker network to get type /usr/bin/docker network inspect
host --format='{{.Driver}}'.
Error constructing docker command, docker error code=-1, error message='Unknown
error'
{noformat}
I suggest adding a property, say "yarn.nodemanager.runtime.linux.docker.check",
with the following options:
- STARTUP: with this option the NodeManager would not start if the Docker
binaries are missing or the Docker daemon is not running (the exception is
considered FATAL during startup).
- RUNTIME: would give a more detailed, user-friendly exception on the
NodeManager's side (NM logs) if the Docker binaries are missing or the daemon
is not running. This would also prevent further Docker container allocation
for as long as the binaries are missing or the Docker daemon is not running.
- NONE (default): preserves the current behaviour, throwing an exception
during container allocation and carrying on with the default retry procedure.
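
The three options above could be modelled roughly as follows. This is only a
sketch of the proposed semantics, not NodeManager code: the class
DockerCheckSketch and the methods parseMode, binaryPresent, abortStartup and
rejectDockerAllocation are hypothetical names, and the check covers only the
presence of the binary (probing the daemon is left out).

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Locale;

/** Hypothetical sketch of the proposed docker.check options; not actual YARN code. */
public class DockerCheckSketch {

    /** The three values proposed for yarn.nodemanager.runtime.linux.docker.check. */
    public enum Mode { STARTUP, RUNTIME, NONE }

    /**
     * Parses the property value. A missing or blank value falls back to NONE,
     * the proposed default. Unknown values throw IllegalArgumentException.
     */
    public static Mode parseMode(String value) {
        if (value == null || value.trim().isEmpty()) {
            return Mode.NONE;
        }
        return Mode.valueOf(value.trim().toUpperCase(Locale.ROOT));
    }

    /** Returns true if the configured docker.binary is a regular, executable file. */
    public static boolean binaryPresent(String dockerBinary) {
        Path path = Paths.get(dockerBinary);
        return Files.isRegularFile(path) && Files.isExecutable(path);
    }

    /** STARTUP: an unusable Docker setup is fatal while the NodeManager starts. */
    public static boolean abortStartup(Mode mode, boolean dockerUsable) {
        return mode == Mode.STARTUP && !dockerUsable;
    }

    /**
     * RUNTIME: Docker container allocation is refused while Docker is unusable.
     * NONE keeps today's behaviour, so nothing is refused up front.
     * Assumption: STARTUP adds no extra check after startup.
     */
    public static boolean rejectDockerAllocation(Mode mode, boolean dockerUsable) {
        return mode == Mode.RUNTIME && !dockerUsable;
    }
}
```

With NONE neither method ever returns true, so container launch would still
fail with the error shown above and the default retry procedure would apply.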