[
https://issues.apache.org/jira/browse/YARN-5818?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15629691#comment-15629691
]
Shane Kumpf commented on YARN-5818:
-----------------------------------
Did some initial testing here. Unfortunately, because docker uses a
client/server model, client operations fail with an EOF error while the docker
daemon is down for a restart/upgrade. Our use of {{docker wait}} to retrieve
the container's exit code breaks down because the client operation fails
during the restart/upgrade.
{code}
An error occurred trying to connect: Post
http://%2Fvar%2Frun%2Fdocker.sock/v1.24/containers/c11692777816e44049d610c4ad358a24eefbff707cdbd85c24df3d153c80401e/wait:
EOF
{code}
The docker community believes this is working as intended and does not plan to
change this behavior. It appears we will have to handle retries in the
container-executor (c-e).
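A rough sketch of what such a retry could look like, written in Python for
brevity (the real container-executor is C). The function name, attempt count
and delay are hypothetical, and it assumes the {{docker wait}} client exits
non-zero when it cannot reach the daemon and prints the exit code on stdout
when it succeeds:

```python
import subprocess
import time


def wait_for_exit_code(container_id, max_attempts=5, delay_seconds=2.0,
                       run=subprocess.run, sleep=time.sleep):
    """Return the container's exit code, retrying while `docker wait` fails.

    Hypothetical helper, not the container-executor implementation.
    `run` and `sleep` are injectable so the loop can be exercised without
    a docker daemon.
    """
    for attempt in range(1, max_attempts + 1):
        # Assumption: the docker client returns non-zero when the daemon
        # is unreachable (e.g. the EOF error during a restart/upgrade).
        result = run(["docker", "wait", container_id],
                     capture_output=True, text=True)
        if result.returncode == 0:
            # On success `docker wait` prints the container's exit code.
            return int(result.stdout.strip())
        if attempt < max_attempts:
            # Give the daemon time to come back before trying again.
            sleep(delay_seconds)
    raise RuntimeError(
        "docker wait failed %d times for %s" % (max_attempts, container_id))
```

A fixed delay keeps the sketch short; a real implementation would probably
want a configurable limit so that a daemon that never comes back is
eventually reported as a failure rather than retried forever.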
> Support the Docker Live Restore feature
> ---------------------------------------
>
> Key: YARN-5818
> URL: https://issues.apache.org/jira/browse/YARN-5818
> Project: Hadoop YARN
> Issue Type: Sub-task
> Components: yarn
> Reporter: Shane Kumpf
>
> Docker 1.12.x introduced the docker [Live
> Restore|https://docs.docker.com/engine/admin/live-restore/] feature which
> allows docker containers to survive docker daemon restarts/upgrades. Support
> for this feature should be added to YARN to allow docker changes and upgrades
> to be less impactful to existing containers.