[ https://issues.apache.org/jira/browse/YARN-8629?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16572079#comment-16572079 ]
Suma Shivaprasad commented on YARN-8629:
----------------------------------------
CGroupsHandler may be called more than once for the same container, because
both LCE (LinuxContainerExecutor) reapContainer and
LCE.handleLaunchForLaunchType call postComplete, which in turn calls
checkAndDeleteCgroup. On the second call the cgroup directory no longer
exists, which produces this error.
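
To illustrate the double cleanup described above, here is a minimal sketch of
a delete that tolerates being called twice. It is not the attached patch and
not the CGroupsHandlerImpl code: the class CgroupCleanupSketch and the method
deleteCgroupIfPresent are hypothetical names, and a plain temporary directory
stands in for a real cgroup mount.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.Files;
import java.nio.file.NoSuchFileException;
import java.nio.file.Path;

// Hypothetical sketch; not the YARN-8629 patch and not CGroupsHandlerImpl.
final class CgroupCleanupSketch {

    /**
     * Returns true when the cgroup directory is gone after the call,
     * including the case where an earlier call already removed it.
     * Returns false when the tasks file still lists at least one process.
     */
    static boolean deleteCgroupIfPresent(Path cgroupDir) {
        // A second cleanup call lands here: nothing left to delete.
        if (Files.notExists(cgroupDir)) {
            return true;
        }
        try {
            Path tasks = cgroupDir.resolve("tasks");
            if (Files.exists(tasks)) {
                boolean busy = Files.readAllLines(tasks).stream()
                        .anyMatch(line -> !line.trim().isEmpty());
                if (busy) {
                    return false; // processes still attached, do not delete
                }
            }
            Files.deleteIfExists(cgroupDir);
            return true;
        } catch (NoSuchFileException e) {
            // The directory vanished between the check and the read,
            // for example because another cleanup path ran concurrently.
            return true;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("cgroup-sketch");
        System.out.println(deleteCgroupIfPresent(dir)); // first cleanup call
        System.out.println(deleteCgroupIfPresent(dir)); // second call, directory already gone
    }
}
```

The point of the sketch is only that a missing directory is treated as
"already cleaned up" rather than as a failure; whether the real fix should
guard inside the handler or avoid the second postComplete call is a separate
design question.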
> Container cleanup fails while trying to delete Cgroups
> ------------------------------------------------------
>
> Key: YARN-8629
> URL: https://issues.apache.org/jira/browse/YARN-8629
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Yesha Vora
> Assignee: Suma Shivaprasad
> Priority: Critical
> Attachments: YARN-8629.1.patch
>
>
> When an application failed to launch a container, cleanup of the
> container also failed with the message below.
> {code}
> 2018-08-06 03:28:20,351 WARN resources.CGroupsHandlerImpl (CGroupsHandlerImpl.java:checkAndDeleteCgroup(523)) - Failed to read cgroup tasks file.
> java.io.FileNotFoundException: /sys/fs/cgroup/cpu,cpuacct/hadoop-yarn-tmp-cxx/container_e02_1533336898541_0010_20_000002/tasks (No such file or directory)
> at java.io.FileInputStream.open0(Native Method)
> at java.io.FileInputStream.open(FileInputStream.java:195)
> at java.io.FileInputStream.<init>(FileInputStream.java:138)
> at java.io.FileInputStream.<init>(FileInputStream.java:93)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.checkAndDeleteCgroup(CGroupsHandlerImpl.java:507)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsHandlerImpl.deleteCGroup(CGroupsHandlerImpl.java:542)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.CGroupsCpuResourceHandlerImpl.postComplete(CGroupsCpuResourceHandlerImpl.java:238)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.linux.resources.ResourceHandlerChain.postComplete(ResourceHandlerChain.java:111)
> at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.postComplete(LinuxContainerExecutor.java:964)
> at org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor.reapContainer(LinuxContainerExecutor.java:787)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainerLaunch.cleanupContainer(ContainerLaunch.java:821)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:161)
> at org.apache.hadoop.yarn.server.nodemanager.containermanager.launcher.ContainersLauncher.handle(ContainersLauncher.java:57)
> at org.apache.hadoop.yarn.event.AsyncDispatcher.dispatch(AsyncDispatcher.java:197)
> at org.apache.hadoop.yarn.event.AsyncDispatcher$1.run(AsyncDispatcher.java:126)
> at java.lang.Thread.run(Thread.java:748)
> 2018-08-06 03:28:20,372 WARN resources.CGroupsHandlerImpl (CGroupsHandlerImpl.java:checkAndDeleteCgroup(523)) - Failed to read cgroup tasks file.
> {code}