[ 
https://issues.apache.org/jira/browse/YARN-9074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16705036#comment-16705036
 ] 

Shane Kumpf commented on YARN-9074:
-----------------------------------

{quote}In my opinion, if we want to debug, we can just rerun the docker command 
manually,
{quote}
Recreating a new container and inspecting the state of the existing 
container are two different things, in my opinion. Being able to see the state 
of the failed/exited container has value. I think we should retain support for 
the debug deletion delay.
{quote}On the contrary, in most cases, if we don't need debug, it would be 
unreasonable to remove container firstly and then stop container.
{quote}
IIRC, there isn't a way to avoid the deletion task and continue to support the 
debug delay. Is there an issue that you encountered that you can share more 
detail on? For normal execution, the container will be in an exited state, 
meaning {{docker stop}} won't be called. If you'd like to add an additional 
{{docker rm}} to cleanupContainer when the debug delay is zero, I don't have a 
major concern. If that approach is taken, we'd need to understand the impact 
for relaunch, since relaunch will try to {{docker start}} the existing 
container, so a {{docker rm}} would be undesirable in this case.
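
To make the ordering constraints above concrete, here is a minimal sketch of the decision logic. It is not the NodeManager implementation: the class {{DockerCleanupSketch}}, the method {{planCleanup}}, and its parameters are hypothetical names chosen for illustration, and the commands are only collected as strings rather than executed. It assumes an immediate {{docker rm}} is wanted only when the debug delay is zero and no relaunch is expected.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative only: hypothetical names, not the actual YARN classes.
public class DockerCleanupSketch {

    // Simplified view of the container state as seen at cleanup time.
    enum ContainerState { RUNNING, EXITED }

    /**
     * Returns the docker commands a single cleanup pass would issue, in order.
     * Assumptions (hypothetical parameters):
     *   debugDelaySec    - the configured debug deletion delay, 0 means no delay
     *   relaunchExpected - true if a relaunch may "docker start" this container
     */
    static List<String> planCleanup(String containerId, ContainerState state,
                                    long debugDelaySec, boolean relaunchExpected) {
        List<String> commands = new ArrayList<>();

        // A container that already exited does not need "docker stop".
        if (state == ContainerState.RUNNING) {
            commands.add("docker stop " + containerId);
        }

        // Remove last, and only when nobody needs the container afterwards:
        // a non-zero debug delay keeps it around for inspection, and a
        // relaunch needs the existing container for "docker start".
        if (debugDelaySec == 0 && !relaunchExpected) {
            commands.add("docker rm " + containerId);
        }
        return commands;
    }

    public static void main(String[] args) {
        // Normal exit, no debug delay, no relaunch.
        System.out.println(planCleanup("c1", ContainerState.EXITED, 0, false));
        // Still running, debug delay configured: stop only, keep for inspection.
        System.out.println(planCleanup("c2", ContainerState.RUNNING, 600, false));
        // Exited but a relaunch is expected: leave the container in place.
        System.out.println(planCleanup("c3", ContainerState.EXITED, 0, true));
    }
}
```

In this sketch the removal is always the last step and is skipped entirely when either the debug delay or a pending relaunch requires the container to stay, which is the relaunch concern raised above.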

> Docker container rm command should be executed after stop
> ---------------------------------------------------------
>
>                 Key: YARN-9074
>                 URL: https://issues.apache.org/jira/browse/YARN-9074
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zhaohui Xin
>            Assignee: Zhaohui Xin
>            Priority: Major
>
> {code:java}
> @Override
> public void transition(ContainerImpl container, ContainerEvent event) {
>   container.setIsReInitializing(false);
>   // Set exit code to 0 on success
>   container.exitCode = 0;
>   // TODO: Add containerWorkDir to the deletion service.
>   if (DockerLinuxContainerRuntime.isDockerContainerRequested(
>       container.daemonConf,
>       container.getLaunchContext().getEnvironment())) {
>     removeDockerContainer(container);
>   }
>   if (clCleanupRequired) {
>     container.dispatcher.getEventHandler().handle(
>         new ContainersLauncherEvent(container,
>             ContainersLauncherEventType.CLEANUP_CONTAINER));
>   }
>   container.cleanup();
> }{code}
> Currently, when a container finishes, the NM first executes "_docker rm xxx_" 
> to remove it, and this task is placed in the deletionService. See YARN-5366 
> for more.
> Next, the NM executes the "_docker stop_" and "_docker kill_" commands. These 
> two commands are wrapped in the ContainerCleanup thread and executed by 
> ContainersLauncher. See YARN-7644 for more.
> As a result, the container's cleanup is split across two threads. I think we 
> should refactor this code so that the whole docker container killing process 
> is placed in the ContainerCleanup thread, with "_docker rm_" executed last.



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
