[ 
https://issues.apache.org/jira/browse/YARN-9562?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16971141#comment-16971141
 ] 

Shane Kumpf commented on YARN-9562:
-----------------------------------

Thanks for the new patches, [~ebadger]! With these patches applied, I was able 
to run both a dshell job and an MR pi job on runC.
{code}
[root@y7001 ~]# runc list
ID                                           PID         STATUS      BUNDLE                                                                                                              CREATED                          OWNER
container_e02_1573397883403_0003_01_000002   32546       running     /tmp/hadoop-yarn/nm-local-dir/nmPrivate/application_1573397883403_0003/container_e02_1573397883403_0003_01_000002   2019-11-10T15:03:22.810203063Z   root
{code}

However, cleanup of the container resources is failing with "Permission 
denied" errors:
{code}
2019-11-10 15:03:11,637 INFO org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting absolute path : /tmp/hadoop-yarn/nm-local-dir/usercache/hadoopuser/appcache/application_1573397883403_0002/container_e02_1573397883403_0002_01_000002
2019-11-10 15:03:11,653 WARN org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor: Shell execution returned exit code: 255. Privileged Execution Operation Stderr:
Nonzero exit code=-1, error message='Unknown error code'

Stdout: main : command provided 3
main : run as user is nobody
main : requested yarn user is hadoopuser
failed to rmdir application_1573397883403_0002: Permission denied
failed to rmdir appcache: Permission denied
failed to rmdir filecache: Permission denied
failed to rmdir hadoopuser: Permission denied
failed to rmdir usercache: Permission denied
failed to rmdir filecache: Permission denied
failed to rmdir nm-local-dir: Permission denied
failed to rmdir hadoop-yarn: Directory not empty
failed to rmdir private_slash_tmp: Directory not empty
Error while deleting /tmp/hadoop-yarn/nm-local-dir/usercache/hadoopuser/appcache/application_1573397883403_0002/container_e02_1573397883403_0002_01_000002: 39 (Directory not empty)

Full command array for failed execution:
[/usr/local/hadoop/bin/container-executor, nobody, hadoopuser, 3, /tmp/hadoop-yarn/nm-local-dir/usercache/hadoopuser/appcache/application_1573397883403_0002/container_e02_1573397883403_0002_01_000002]
2019-11-10 15:03:11,653 ERROR org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: DeleteAsUser for /tmp/hadoop-yarn/nm-local-dir/usercache/hadoopuser/appcache/application_1573397883403_0002/container_e02_1573397883403_0002_01_000002 returned with exit code: 255
org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationException: ExitCodeException exitCode=255: Nonzero exit code=-1, error message='Unknown error code'

        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:182)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.privileged.PrivilegedOperationExecutor.executePrivilegedOperation(PrivilegedOperationExecutor.java:208)
        at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.deleteAsUser(LinuxContainerExecutor.java:871)
        at org.apache.hadoop.yarn.server.nodemanager.containermanager.deletion.task.FileDeletionTask.run(FileDeletionTask.java:125)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:293)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: ExitCodeException exitCode=255: Nonzero exit code=-1, error message='Unknown error code'
{code}
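
To make the failure pattern in the log above easier to read, the "failed to rmdir" lines can be grouped by their error reason. The following is only a minimal sketch for triaging the container-executor stdout and is not part of any patch on this issue; the class name RmdirFailureSummary and the method groupByReason are hypothetical, and the line format is assumed to match the log shown here.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of Hadoop: summarizes rmdir failures
// reported in the container-executor stdout.
public class RmdirFailureSummary {
    // Assumed line format: "failed to rmdir <name>: <reason>".
    private static final Pattern FAILED_RMDIR =
        Pattern.compile("^failed to rmdir (.+): (.+)$");

    /** Groups directory names from "failed to rmdir" lines by error reason. */
    public static Map<String, List<String>> groupByReason(List<String> lines) {
        // LinkedHashMap keeps reasons in the order they first appear.
        Map<String, List<String>> byReason = new LinkedHashMap<>();
        for (String line : lines) {
            Matcher m = FAILED_RMDIR.matcher(line.trim());
            if (m.matches()) {
                byReason.computeIfAbsent(m.group(2), k -> new ArrayList<>())
                        .add(m.group(1));
            }
        }
        return byReason;
    }

    public static void main(String[] args) {
        // Lines copied from the stdout section of the log above.
        List<String> stdout = Arrays.asList(
            "main : command provided 3",
            "main : run as user is nobody",
            "main : requested yarn user is hadoopuser",
            "failed to rmdir application_1573397883403_0002: Permission denied",
            "failed to rmdir appcache: Permission denied",
            "failed to rmdir filecache: Permission denied",
            "failed to rmdir hadoopuser: Permission denied",
            "failed to rmdir usercache: Permission denied",
            "failed to rmdir filecache: Permission denied",
            "failed to rmdir nm-local-dir: Permission denied",
            "failed to rmdir hadoop-yarn: Directory not empty",
            "failed to rmdir private_slash_tmp: Directory not empty");
        groupByReason(stdout).forEach((reason, dirs) ->
            System.out.println(reason + " (" + dirs.size() + "): " + dirs));
    }
}
```

On the log above this separates the seven "Permission denied" entries from the two "Directory not empty" entries.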

> Add Java changes for the new RuncContainerRuntime
> -------------------------------------------------
>
>                 Key: YARN-9562
>                 URL: https://issues.apache.org/jira/browse/YARN-9562
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>            Priority: Major
>         Attachments: YARN-9562.001.patch, YARN-9562.002.patch, 
> YARN-9562.003.patch, YARN-9562.004.patch, YARN-9562.005.patch, 
> YARN-9562.006.patch, YARN-9562.007.patch, YARN-9562.008.patch, 
> YARN-9562.009.patch, YARN-9562.010.patch, YARN-9562.011.patch, 
> YARN-9562.012.patch, YARN-9562.013.patch, YARN-9562.014.patch
>
>
> This JIRA will be used to add the Java changes for the new 
> RuncContainerRuntime. This will work off of YARN-9560 to use much of the 
> existing DockerLinuxContainerRuntime code once it is moved up into an 
> abstract class that can be extended. 


