[ 
https://issues.apache.org/jira/browse/YARN-9074?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16740577#comment-16740577
 ] 

Eric Yang edited comment on YARN-9074 at 1/11/19 5:33 PM:
----------------------------------------------------------

[~uranus] The transitions defined during the design phase differ in one small
way, as listed in this table:

| Transition | Remove workdir | Remove Docker Image |
| Reinitialize | No | Yes |
| Task Finished | Yes | Yes |
| Task Killed | Yes | Yes |

The workdir should not be removed if the container is reinitialized; this is a
best-effort attempt to keep the task's intermediate data.  However, the
container-executor code does not work with a directory that already exists, so
in practice the node manager also removes the workdir on reinitialization.
Hence, your suggested optimization can work for all cases.  I am inclined to
include this patch if [[email protected]] and [~csingh] agree that we have no
plan to make an extra effort to save the workdir during upgrade or stop/start
of a service.
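
For illustration only, the table and the note about what happens in practice can be written down as a small Java enum. This is a sketch, not NodeManager code: the type and method names (ContainerTransition, removesWorkdirByDesign, removesWorkdirInPractice, removesDockerImage) are hypothetical and chosen to mirror the table's columns.

```java
/**
 * Sketch of the cleanup table from the comment above.
 * All names here are hypothetical; none of them exist in Hadoop YARN.
 */
enum ContainerTransition {
    // (remove workdir by design, remove Docker image)
    REINITIALIZE(false, true),
    TASK_FINISHED(true, true),
    TASK_KILLED(true, true);

    private final boolean removeWorkdirByDesign;
    private final boolean removeDockerImage;

    ContainerTransition(boolean removeWorkdirByDesign, boolean removeDockerImage) {
        this.removeWorkdirByDesign = removeWorkdirByDesign;
        this.removeDockerImage = removeDockerImage;
    }

    /** The "Remove workdir" column as designed. */
    boolean removesWorkdirByDesign() {
        return removeWorkdirByDesign;
    }

    /**
     * What the comment says happens in practice: the workdir is removed for
     * every transition, because container-executor cannot reuse a directory
     * that already exists.
     */
    boolean removesWorkdirInPractice() {
        return true;
    }

    /** The "Remove Docker Image" column. */
    boolean removesDockerImage() {
        return removeDockerImage;
    }
}
```

Under these assumptions, the only row where design and practice disagree is REINITIALIZE, which is why a single cleanup path can serve all three transitions.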


was (Author: eyang):
[~uranus] The transitions defined during the design phase differ in one small
way, as listed in this table:

| Transition | Remove workdir | Remove Docker Image |
| Reinitialize | No | Yes |
| Task Finished | Yes | Yes |
| Task Killed | Yes | Yes |

The workdir should not be removed if the container is reinitialized; this is a
best-effort attempt to keep the task's intermediate data.  However, the
container-executor code does not work with a directory that already exists, so
in practice the node manager also removes the workdir on reinitialization.
Hence, your suggested optimization can work for all cases.  I am inclined to
include this patch if [[email protected]] agrees.

> Docker container rm command should be executed after stop
> ---------------------------------------------------------
>
>                 Key: YARN-9074
>                 URL: https://issues.apache.org/jira/browse/YARN-9074
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Zhaohui Xin
>            Assignee: Zhaohui Xin
>            Priority: Major
>         Attachments: YARN-9074.001.patch, image-2018-12-01-11-36-12-448.png, 
> image-2018-12-01-11-38-18-191.png
>
>
> {code:java}
> @Override
> public void transition(ContainerImpl container, ContainerEvent event) {
>   container.setIsReInitializing(false);
>   // Set exit code to 0 on success
>   container.exitCode = 0;
>   // TODO: Add containerWorkDir to the deletion service.
>   if (DockerLinuxContainerRuntime.isDockerContainerRequested(
>       container.daemonConf,
>       container.getLaunchContext().getEnvironment())) {
>     removeDockerContainer(container);
>   }
>   if (clCleanupRequired) {
>     container.dispatcher.getEventHandler().handle(
>         new ContainersLauncherEvent(container,
>             ContainersLauncherEventType.CLEANUP_CONTAINER));
>   }
>   container.cleanup();
> }{code}
> Now, when a container finishes, the NM first executes "_docker rm xxx_" to
> remove it, and this is run as a thread in the DeletionService. See YARN-5366
> for more.
> Next, the NM executes the "_docker stop_" and "_docker kill_" commands. These
> two commands are wrapped in the ContainerCleanup thread and executed by
> ContainersLauncher. See YARN-7644 for more.
> As a result, the container's cleanup is split across two threads. I think we
> should refactor this code so that the whole Docker container killing process
> takes place in the ContainerCleanup thread, with "_docker rm_" executed last.
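
The ordering proposed in the description can be sketched as a single Runnable that issues stop, kill and rm in sequence on one thread. This is a minimal illustration, not the attached patch: DockerCommandRunner, OrderedDockerCleanup and CleanupOrderSketch are hypothetical names, and the runner only records command strings instead of invoking Docker.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

/** Hypothetical stand-in for whatever actually runs a docker command. */
interface DockerCommandRunner {
    void run(String command, String containerId);
}

/** Hypothetical cleanup task: every docker step runs in order on one thread. */
final class OrderedDockerCleanup implements Runnable {
    private final DockerCommandRunner runner;
    private final String containerId;

    OrderedDockerCleanup(DockerCommandRunner runner, String containerId) {
        this.runner = runner;
        this.containerId = containerId;
    }

    @Override
    public void run() {
        runner.run("stop", containerId);
        // Issued unconditionally to keep the sketch short; real code would
        // presumably check whether the container is still running first.
        runner.run("kill", containerId);
        // rm goes last, after the container has been stopped or killed.
        runner.run("rm", containerId);
    }
}

final class CleanupOrderSketch {
    /** Runs the cleanup on a single worker thread and returns the commands issued. */
    static List<String> runCleanup(String containerId) {
        List<String> issued = Collections.synchronizedList(new ArrayList<>());
        ExecutorService cleanupThread = Executors.newSingleThreadExecutor();
        cleanupThread.submit(new OrderedDockerCleanup(
            (command, id) -> issued.add("docker " + command + " " + id),
            containerId));
        cleanupThread.shutdown();
        try {
            if (!cleanupThread.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("cleanup did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for cleanup", e);
        }
        return new ArrayList<>(issued);
    }

    public static void main(String[] args) {
        System.out.println(runCleanup("container_example"));
    }
}
```

Because all three steps live in one task, no second thread can issue "docker rm" before the stop or kill has been attempted, which is the race the description is concerned with.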



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

