[
https://issues.apache.org/jira/browse/YARN-9486?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eric Yang updated YARN-9486:
----------------------------
Attachment: YARN-9486.003.patch
> Docker container exited with failure does not get clean up correctly
> --------------------------------------------------------------------
>
> Key: YARN-9486
> URL: https://issues.apache.org/jira/browse/YARN-9486
> Project: Hadoop YARN
> Issue Type: Sub-task
> Affects Versions: 3.2.0
> Reporter: Eric Yang
> Assignee: Eric Yang
> Priority: Major
> Attachments: YARN-9486.001.patch, YARN-9486.002.patch,
> YARN-9486.003.patch
>
>
> When a Docker container encounters an error and exits prematurely
> (EXITED_WITH_FAILURE), ContainerCleanup does not remove the container. Instead,
> we get messages that look like this:
> {code}
> java.io.IOException: Could not find
> nmPrivate/application_1555111445937_0008/container_1555111445937_0008_01_000007//container_1555111445937_0008_01_000007.pid
> in any of the directories
> 2019-04-15 20:42:16,454 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl:
> Container container_1555111445937_0008_01_000007 transitioned from
> RELAUNCHING to EXITED_WITH_FAILURE
> 2019-04-15 20:42:16,455 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerCleanup:
> Cleaning up container container_1555111445937_0008_01_000007
> 2019-04-15 20:42:16,455 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerCleanup:
> Container container_1555111445937_0008_01_000007 not launched. No cleanup
> needed to be done
> 2019-04-15 20:42:16,455 WARN
> org.apache.hadoop.yarn.server.nodemanager.NMAuditLogger: USER=hbase
> OPERATION=Container Finished - Failed TARGET=ContainerImpl
> RESULT=FAILURE DESCRIPTION=Container failed with state: EXITED_WITH_FAILURE
> APPID=application_1555111445937_0008
> CONTAINERID=container_1555111445937_0008_01_000007
> 2019-04-15 20:42:16,458 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.container.ContainerImpl:
> Container container_1555111445937_0008_01_000007 transitioned from
> EXITED_WITH_FAILURE to DONE
> 2019-04-15 20:42:16,458 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.application.ApplicationImpl:
> Removing container_1555111445937_0008_01_000007 from application
> application_1555111445937_0008
> 2019-04-15 20:42:16,458 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.monitor.ContainersMonitorImpl:
> Stopping resource-monitoring for container_1555111445937_0008_01_000007
> 2019-04-15 20:42:16,458 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl:
> Considering container container_1555111445937_0008_01_000007 for
> log-aggregation
> 2019-04-15 20:42:16,804 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
> Getting container-status for container_1555111445937_0008_01_000007
> 2019-04-15 20:42:16,804 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
> Getting localization status for container_1555111445937_0008_01_000007
> 2019-04-15 20:42:16,804 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
> Returning ContainerStatus: [ContainerId:
> container_1555111445937_0008_01_000007, ExecutionType: GUARANTEED, State:
> COMPLETE, Capability: <memory:1024, vCores:1>, Diagnostics: ..., ExitStatus:
> -1, IP: null, Host: null, ExposedPorts: , ContainerSubState: DONE]
> 2019-04-15 20:42:18,464 INFO
> org.apache.hadoop.yarn.server.nodemanager.NodeStatusUpdaterImpl: Removed
> completed containers from NM context: [container_1555111445937_0008_01_000007]
> 2019-04-15 20:43:50,476 INFO
> org.apache.hadoop.yarn.server.nodemanager.containermanager.ContainerManagerImpl:
> Stopping container with container Id: container_1555111445937_0008_01_000007
> {code}
> No docker rm command is performed for the failed container.
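
To check how many containers on a node hit the same symptom, a NodeManager log can be scanned for the two messages quoted above. The following is only a minimal sketch, not part of any attached patch: the class name SkippedCleanupScan and the method findSkippedCleanups are hypothetical, the patterns are copied from the message text in the log excerpt, and it assumes each log entry sits on a single line (the excerpt above is wrapped by the mailer).

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for triage; not part of Hadoop or of the YARN-9486 patches.
final class SkippedCleanupScan {
    // ContainerImpl message: a container moved into EXITED_WITH_FAILURE.
    private static final Pattern FAILED = Pattern.compile(
        "Container (container_\\S+) transitioned from \\S+ to EXITED_WITH_FAILURE");
    // ContainerCleanup message: cleanup was skipped for a container.
    private static final Pattern SKIPPED = Pattern.compile(
        "Container (container_\\S+) not launched\\. No cleanup");

    // Returns the ids of containers that both exited with failure and had
    // their cleanup skipped, in order of first failure seen.
    // Assumes one log entry per line.
    static List<String> findSkippedCleanups(List<String> logLines) {
        Set<String> failed = new LinkedHashSet<>();
        Set<String> skipped = new LinkedHashSet<>();
        for (String line : logLines) {
            Matcher f = FAILED.matcher(line);
            if (f.find()) {
                failed.add(f.group(1));
            }
            Matcher s = SKIPPED.matcher(line);
            if (s.find()) {
                skipped.add(s.group(1));
            }
        }
        failed.retainAll(skipped); // keep only ids present in both sets
        return new ArrayList<>(failed);
    }

    public static void main(String[] args) throws IOException {
        if (args.length != 1) {
            System.err.println("usage: java SkippedCleanupScan.java <nodemanager-log>");
            return;
        }
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        for (String id : findSkippedCleanups(lines)) {
            System.out.println(id);
        }
    }
}
```

Each id printed is a candidate for a leftover Docker container. The scan only matches log text, so whether the container still exists has to be confirmed on the node itself.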