[
https://issues.apache.org/jira/browse/YARN-8209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16455474#comment-16455474
]
Eric Badger commented on YARN-8209:
-----------------------------------
After looking at this failure a little more closely, I think that the fix is
going to require more than simply fixing the NPE that showed up in the Docker
deletion task. The NPE was caused by the following lines:
{noformat}
String commandFile =
    dockerClient.writeCommandToTempFile(dockerCommand,
        nmContext.getContainers().get(ContainerId.fromString(containerId)),
        nmContext);
{noformat}
Because the container is dead, the nmContext has no bookkeeping on it. So when
the code looks up the container by its containerId, it finds nothing and
returns null. Then, when we enter {{writeCommandToTempFile()}}, it immediately
tries to get the containerId from the (null) container and NPEs. This is an
easily fixable problem because we don't need to get the container from the
nmContext at all. {{writeCommandToTempFile()}} only uses the containerId, so we
can just pass that into the method instead of the container (a sketch of that
call-site change is below). I figured this would fix the issue and be a minor
patch update. However, that is not the case.
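Here is a rough sketch of the shape of that change (not a committed patch); the
overload of {{writeCommandToTempFile()}} that takes a {{ContainerId}} instead of
a {{Container}} is hypothetical until a patch actually adds it:
{code:java}
// Current call site: NPEs because the nmContext lookup returns null for a dead container.
//   String commandFile = dockerClient.writeCommandToTempFile(dockerCommand,
//       nmContext.getContainers().get(ContainerId.fromString(containerId)),
//       nmContext);

// Proposed shape (hypothetical overload): pass the id itself, which is all the method uses.
String commandFile = dockerClient.writeCommandToTempFile(dockerCommand,
    ContainerId.fromString(containerId), nmContext);
{code}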
After fixing the NPE, I tested the code again and found that some Docker
containers were still not being cleaned up after the debug delay, because the
.cmd file could not be written. The .cmd file is what we pass to the
container-executor to invoke all docker commands, including {{docker rm}}.
{noformat}
2018-04-26 19:28:17,631 WARN [DeletionService #0] docker.DockerClient
(DockerClient.java:writeCommandToTempFile(151)) - Unable to write docker
command to
/tmp/hadoop-local/nmPrivate/application_1524758461670_0003/container_1524758461670_0003_01_000003
{noformat}
This is where it gets a little trickier. The container's directory inside
nmPrivate is being cleaned up before the docker .cmd file is written, but not
for every container. It appears to be a race between the deletion task for the
nmPrivate directory and the {{DockerContainerDeletionTask}}.
{noformat:title=ContainerImpl.java:1530}
if (DockerLinuxContainerRuntime.isDockerContainerRequested(
    container.getLaunchContext().getEnvironment())) {
  removeDockerContainer(container);
}
if (clCleanupRequired) {
  container.dispatcher.getEventHandler().handle(
      new ContainersLauncherEvent(container,
          ContainersLauncherEventType.CLEANUP_CONTAINER));
}
container.cleanup();
{noformat}
{{removeDockerContainer(container);}} handles the creation of the
{{DockerContainerDeletionTask}}, while {{container.cleanup();}} creates the
{{FileDeletionTask}} that removes the container's nmPrivate directory. Both get
scheduled to run after {{delete.debug-delay-sec}} seconds, but they are added
to thread pools with no ordering guarantee between them. Sometimes the
{{DockerContainerDeletionTask}} wins the race (most of the time in my testing),
and sometimes the {{FileDeletionTask}} wins. In the latter case the docker .cmd
file fails to be written, so the docker container is never removed and sticks
around in the Exited state until it is cleaned up manually (a toy illustration
of the scheduling is below).
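To make the lack of ordering concrete, here is a small, self-contained
illustration (not NM code) of the situation: two tasks handed to a scheduled
pool with the same delay, where nothing guarantees which one runs or finishes
first:
{code:java}
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

public class DeletionRaceSketch {
  public static void main(String[] args) throws Exception {
    ScheduledExecutorService pool = Executors.newScheduledThreadPool(2);
    // Stand-in for the DockerContainerDeletionTask: needs the nmPrivate dir to write the .cmd file.
    pool.schedule(() -> System.out.println("docker rm via .cmd file"), 5, TimeUnit.SECONDS);
    // Stand-in for the FileDeletionTask: deletes that same nmPrivate directory.
    pool.schedule(() -> System.out.println("delete nmPrivate container dir"), 5, TimeUnit.SECONDS);
    pool.shutdown();
    pool.awaitTermination(10, TimeUnit.SECONDS);
  }
}
{code}
Whichever stand-in runs first is effectively arbitrary, which matches the
intermittent cleanup failures I saw.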
Because of all of this, *I believe that the correct course of action is to
revert YARN-8064 to unblock the impending 3.1 release*. That way, if the
release managers want to go ahead with a new release, they can do so without
this blocker. However, we should try to get a fix in quickly so that YARN-8064
can go back in, since [~vinodkv] mentioned that he would like it in the next
release.
As for a fix, my proposed solution would be to entirely remove the .cmd file
requirement when invoking the container-executor for docker commands. That way
we could call the container-executor directly from the
{{DockerContainerDeletionTask}} without writing a file anywhere. Since we don't
need to write a file, we can't race with the removal of the container's
nmPrivate directory (rough sketch below).
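A rough sketch of that direction follows. This is not an actual
container-executor interface; the {{--remove-docker-container}} flag and the
binary path shown here are placeholders for whatever the real invocation ends
up being:
{code:java}
import java.io.IOException;
import java.util.Arrays;
import java.util.List;

public class DirectDockerRemoveSketch {
  // Placeholder path; the real path would come from the NM configuration.
  private static final String CONTAINER_EXECUTOR = "/path/to/container-executor";

  public static void removeDockerContainer(String dockerContainerId)
      throws IOException, InterruptedException {
    // Pass the small docker command straight to the binary as argv; no temp .cmd file
    // is written, so there is nothing to race with the FileDeletionTask.
    List<String> argv = Arrays.asList(
        CONTAINER_EXECUTOR, "--remove-docker-container", dockerContainerId);
    Process p = new ProcessBuilder(argv).inheritIO().start();
    int rc = p.waitFor();
    if (rc != 0) {
      throw new IOException("container-executor exited with " + rc);
    }
  }
}
{code}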
Additionally, I think this is a good change overall, independent of fixing this
removal issue. It is unnecessary to write a file to disk just to run
{{docker rm <container>}}, and if the disk is pegged that write could be very
slow and unreliable; if we don't have to write to disk, we shouldn't. For
something like the {{docker run}} command, a .cmd file makes sense because of
the potentially massive number of volume mount arguments. But for smaller
commands, I think we should just pass them as separate arguments to the
container-executor. This also makes the code path within the container-executor
significantly smaller, since it wouldn't have to parse a .cmd file. Overall, I
think it gives us many wins.
cc [[email protected]], [~jlowe], [~eyang], [~Jim_Brennan]
> NPE in DeletionService
> ----------------------
>
> Key: YARN-8209
> URL: https://issues.apache.org/jira/browse/YARN-8209
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Chandni Singh
> Assignee: Eric Badger
> Priority: Major
>
> {code:java}
> 2018-04-25 23:38:41,039 WARN concurrent.ExecutorHelper (ExecutorHelper.java:logThrowableFromAfterExecute(63)) - Caught exception in thread DeletionService #1:
> java.lang.NullPointerException
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.docker.DockerClient.writeCommandToTempFile(DockerClient.java:109)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.docker.DockerCommandExecutor.executeDockerCommand(DockerCommandExecutor.java:85)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.docker.DockerCommandExecutor.executeStatusCommand(DockerCommandExecutor.java:192)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.docker.DockerCommandExecutor.getContainerStatus(DockerCommandExecutor.java:128)
>     at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.removeDockerContainer(LinuxContainerExecutor.java:935)
>     at org.apache.hadoop.yarn.server.nodemanager.containermanager.deletion.task.DockerContainerDeletionTask.run(DockerContainerDeletionTask.java:61)
>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>     at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>     at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>     at java.lang.Thread.run(Thread.java:748){code}