[ https://issues.apache.org/jira/browse/YARN-5040?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15688554#comment-15688554 ]

Tao Jie commented on YARN-5040:
-------------------------------

We have encountered the same problem. We set
yarn.nodemanager.resource.percentage-physical-cpu-limit=80 and tested both
kernel versions 2.6.32-642 and 3.10.103 with hadoop-2.7.1 by running Terasort;
in both cases the kernel crashed.
After upgrading the kernel to 4.8.1, the kernel panic no longer occurred. It
seems that this kernel panic is due to a kernel cgroup bug that has been fixed
in newer kernel versions.
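
For reference, a minimal yarn-site.xml sketch of the cgroups CPU isolation
settings involved (the property names are the standard ones shipped with
Hadoop 2.7.x; the hierarchy path and mount flag below are illustrative
defaults, not values taken from the cluster in question):

<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
</property>
<property>
  <!-- illustrative default: the cgroup hierarchy YARN creates under the cpu controller -->
  <name>yarn.nodemanager.linux-container-executor.cgroups.hierarchy</name>
  <value>/hadoop-yarn</value>
</property>
<property>
  <!-- illustrative default: whether the NodeManager mounts the cgroup controllers itself -->
  <name>yarn.nodemanager.linux-container-executor.cgroups.mount</name>
  <value>false</value>
</property>
<property>
  <!-- values below 100 make the resources handler apply per-node CPU limiting
       through the kernel cgroup cpu controller; 100 (the default) skips that path -->
  <name>yarn.nodemanager.resource.percentage-physical-cpu-limit</name>
  <value>80</value>
</property>

Setting the limit back to 100 avoids the code path that writes the cgroup CPU
settings and, as noted in the issue description, also avoids the panics;
upgrading the kernel let us keep the 80% limit.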

> CPU Isolation with CGroups triggers kernel panics on Centos 7.1/7.2 when 
> yarn.nodemanager.resource.percentage-physical-cpu-limit < 100
> --------------------------------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-5040
>                 URL: https://issues.apache.org/jira/browse/YARN-5040
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.9.0
>            Reporter: Sidharta Seethana
>            Assignee: Varun Vasudev
>
> /cc [~vvasudev]
> We have been running some benchmarks internally with resource isolation
> enabled. We have consistently run into kernel panics when running a large job
> (a large pi job, terasort). These kernel panics went away when we set
> yarn.nodemanager.resource.percentage-physical-cpu-limit=100. Anything less
> than 100 triggers different behavior in YARN's CPU resource handler, which
> seems to cause these issues. Looking at the kernel crash dumps, the
> backtraces were different - sometimes pointing to java processes, sometimes
> not.
> Kernel versions used: 3.10.0-229.14.1.el7.x86_64 and
> 3.10.0-327.13.1.el7.x86_64.


