[ https://issues.apache.org/jira/browse/YARN-4382?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15021630#comment-15021630 ]
lachisis commented on YARN-4382:
--------------------------------

If many container hierarchies are left behind, they keep the CPUs of this node busy even when no jobs are running.

------------------------------------------------------------------------------
   PerfTop:  129889 irqs/sec  kernel:76.3% [100000 cycles],  (all, 16 CPUs)
------------------------------------------------------------------------------

             samples    pcnt   kernel function
             _______   _____   _______________

           117166.00 - 59.1% : tg_shares_up
            35688.00 - 18.0% : _spin_lock_irqsave
            12045.00 -  6.1% : __set_se_shares

> Container hierarchy in cgroup may remain for ever after the container have be terminated
> ----------------------------------------------------------------------------------------
>
>                 Key: YARN-4382
>                 URL: https://issues.apache.org/jira/browse/YARN-4382
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.5.2
>            Reporter: lachisis
>
> If we use LinuxContainerExecutor to execute the containers, this problem
> may occur.
> In the common case, when a container runs, a corresponding hierarchy is
> created in the cgroup directory. When the container terminates, the
> hierarchy is deleted within a few seconds (this time can be configured with
> yarn.nodemanager.linux-container-executor.cgroups.delete-delay-ms).
> In the code, I find that CgroupsLCEResource sends a signal to kill the
> container process asynchronously and, at the same time, tries to delete the
> container hierarchy within the configured "delete-delay-ms" time.
> But if killing the container process takes longer than the
> "delete-delay-ms" time, the container hierarchy remains forever.
>

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
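
The race described in the issue can be illustrated with a small model. This is only a sketch in Python, not the NodeManager code: the names `delete_with_timeout` and `simulate` are hypothetical, a temporary directory stands in for the container's cgroup hierarchy, and a file inside it stands in for a container process that has not exited yet (a non-empty directory cannot be removed with `rmdir`, much as a cgroup that still has tasks cannot). The delete attempt retries until its timeout and then gives up; if the asynchronous kill completes later than that, nothing comes back to remove the directory.

```python
import os
import tempfile
import threading
import time


def delete_with_timeout(path, timeout_s, poll_s=0.01):
    """Retry os.rmdir(path) until it succeeds or timeout_s elapses.

    Returns True if the directory was removed, False if we gave up.
    """
    deadline = time.monotonic() + timeout_s
    while True:
        try:
            os.rmdir(path)
            return True
        except OSError:
            # Directory is still occupied (stand-in for a cgroup with tasks).
            if time.monotonic() >= deadline:
                return False
            time.sleep(poll_s)


def simulate(kill_latency_s, delete_timeout_s):
    """Return True if the hierarchy is left behind after the delete attempt."""
    hierarchy = tempfile.mkdtemp(prefix="container_")
    task = os.path.join(hierarchy, "task")
    open(task, "w").close()  # the "process" still living in the hierarchy

    # Asynchronous kill: the task disappears only after kill_latency_s.
    killer = threading.Timer(kill_latency_s, os.remove, args=(task,))
    killer.start()

    # Started at the same time as the kill, bounded by the timeout.
    delete_with_timeout(hierarchy, delete_timeout_s)
    killer.join()  # wait until the simulated kill has completed

    leaked = os.path.isdir(hierarchy)
    if leaked:
        os.rmdir(hierarchy)  # clean up the demo directory
    return leaked


if __name__ == "__main__":
    print("slow kill, short timeout -> leaked:", simulate(0.5, 0.05))
    print("fast kill, long timeout  -> leaked:", simulate(0.05, 2.0))
```

In this model the directory is leaked only when the kill takes longer than the delete timeout, which matches the condition given in the description. One possible remedy, stated here as an assumption rather than the project's fix, would be to attempt the deletion again once the process is known to have exited.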