[
https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16896610#comment-16896610
]
Eric Yang commented on YARN-9562:
---------------------------------
[~ebadger] I couldn't get very far with patch 002 with YARN-9561 patch 002. I
have configured the following configs:
{code}
<property>
<name>yarn.nodemanager.runtime.linux.allowed-runtimes</name>
<value>default,docker,runc</value>
<description>
Comma separated list of runtimes that are allowed when using
LinuxContainerExecutor. The allowed values are default, docker, and
javasandbox.
</description>
</property>
<property>
<name>yarn.nodemanager.runtime.linux.runc.image-tag-to-manifest-plugin.local-hash-file</name>
<value>/tmp/centos</value>
</property>
{code}
Node manager refused to start with this error:
{code}
2019-07-30 23:04:56,998 WARN
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor:
IOException executing command:
java.io.InterruptedIOException: java.lang.InterruptedException
at org.apache.hadoop.util.Shell.runCommand(Shell.java:1011)
at org.apache.hadoop.util.Shell.run(Shell.java:901)
at
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1213)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:154)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.RuncContainerRuntime$1.run(RuncContainerRuntime.java:264)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at java.lang.UNIXProcess.waitFor(UNIXProcess.java:395)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:1001)
... 11 more
2019-07-30 23:04:56,999 WARN
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.RuncContainerRuntime:
Failed to reap old runc layer mounts
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException:
java.io.InterruptedIOException: java.lang.InterruptedException
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:185)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.runtime.RuncContainerRuntime$1.run(RuncContainerRuntime.java:264)
at
java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.runAndReset(FutureTask.java:308)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$301(ScheduledThreadPoolExecutor.java:180)
at
java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:294)
at
java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at
java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Caused by: java.io.InterruptedIOException: java.lang.InterruptedException
at org.apache.hadoop.util.Shell.runCommand(Shell.java:1011)
at org.apache.hadoop.util.Shell.run(Shell.java:901)
at
org.apache.hadoop.util.Shell$ShellCommandExecutor.execute(Shell.java:1213)
at
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:154)
... 8 more
Caused by: java.lang.InterruptedException
at java.lang.Object.wait(Native Method)
at java.lang.Object.wait(Object.java:502)
at java.lang.UNIXProcess.waitFor(UNIXProcess.java:395)
at org.apache.hadoop.util.Shell.runCommand(Shell.java:1001)
... 11 more
{code}
It looks like it tries to clean up existing runc mounts but failed to find any
and crashed.
# What are the minimum configs required for this feature to work?
# What does the Image Tag to Hash file look like?
> Add Java changes for the new RuncContainerRuntime
> -------------------------------------------------
>
> Key: YARN-9562
> URL: https://issues.apache.org/jira/browse/YARN-9562
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Eric Badger
> Assignee: Eric Badger
> Priority: Major
> Attachments: YARN-9562.001.patch, YARN-9562.002.patch
>
>
> This JIRA will be used to add the Java changes for the new
> RuncContainerRuntime. This will work off of YARN-9560 to use much of the
> existing DockerLinuxContainerRuntime code once it is moved up into an
> abstract class that can be extended.
--
This message was sent by Atlassian JIRA
(v7.6.14#76016)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]