[ 
https://issues.apache.org/jira/browse/YARN-8209?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16455610#comment-16455610
 ] 

Eric Yang commented on YARN-8209:
---------------------------------

[~ebadger] Thank you for the insights. The cmd file is a serialization
contract between the node manager and container-executor.  If we want to
step away from this contract, we need an alternate proposal.  If we read
the command from stdin, we still need to handle possible buffer overflows
and pass environment variables to docker properly.  That is likely to come
full circle and end with a data file between the node manager and
container-executor anyway.  How about we make a small modification to the
docker rm command: skip generating the cmd file and pass the docker
container id to container-executor via an environment variable?  If
container-executor cannot find a .cmd file and the environment variable
names a docker container to delete, it removes that container.  This
decouples docker rm from the .cmd file and avoids the race condition
between FileDeletionTask and DockerContainerDeletionTask.  Could this be a
workable workaround for the race condition?

> NPE in DeletionService
> ----------------------
>
>                 Key: YARN-8209
>                 URL: https://issues.apache.org/jira/browse/YARN-8209
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Chandni Singh
>            Assignee: Eric Badger
>            Priority: Major
>
> {code:java}
> 2018-04-25 23:38:41,039 WARN  concurrent.ExecutorHelper (ExecutorHelper.java:logThrowableFromAfterExecute(63)) - Caught exception in thread DeletionService #1:
> java.lang.NullPointerException
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.docker.DockerClient.writeCommandToTempFile(DockerClient.java:109)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.docker.DockerCommandExecutor.executeDockerCommand(DockerCommandExecutor.java:85)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.docker.DockerCommandExecutor.executeStatusCommand(DockerCommandExecutor.java:192)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.docker.DockerCommandExecutor.getContainerStatus(DockerCommandExecutor.java:128)
>         at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.removeDockerContainer(LinuxContainerExecutor.java:935)
>         at org.apache.hadoop.yarn.server.nodemanager.containermanager.deletion.task.DockerContainerDeletionTask.run(DockerContainerDeletionTask.java:61)
>         at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
>         at java.util.concurrent.FutureTask.run(FutureTask.java:266)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
>         at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
>         at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
>         at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
>         at java.lang.Thread.run(Thread.java:748)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

