[ https://issues.apache.org/jira/browse/YARN-8924?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Zhankun Tang updated YARN-8924:
-------------------------------
    Description: 
This is to re-think the legacy configuration and code for CPU resource isolation. In 
YARN-3542, we introduced _CGroupsCpuResourceHandlerImpl_, based on the new 
_ResourceHandler_ mechanism, but left the configuration 
"yarn.nodemanager.linux-container-executor.resources-handler.class" in place for a 
long time. It now seems confusing to end users.

See YARN-6729: a user set "_DefaultLCEResourcesHandler_" and found that giving 
"percentage-physical-cpu-limit" a value less than "100" had no effect.

As far as I know, internally, _CgroupsLCEResourcesHandler_ and 
_DefaultLCEResourcesHandler_ are both deprecated. YARN no longer uses them.

Instead, YARN uses _CGroupsCpuResourceHandlerImpl_ to do CPU isolation, and only 
under LCE (LinuxContainerExecutor). To enforce CPU usage, we must configure LCE and 
CgroupsLCEResourcesHandler like this:
{noformat}
<property>
  <description>Who will execute (launch) the containers.</description>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <description>The class which should help the LCE handle resources.</description>
  <name>yarn.nodemanager.linux-container-executor.resources-handler.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.util.CgroupsLCEResourcesHandler</value>
</property>
{noformat}
Only with the above settings do CPU-related settings such as 
"percentage-physical-cpu-limit" work as expected.

To avoid confusion like that in YARN-6729, we can do two things:
 # Document more clearly in "NodeManagerCgroups.md" how users should configure CPU 
isolation/enforcement.
 # Make "ResourceHandlerModule" stable, remove the legacy code, and update the 
document to recommend the new setting "yarn.nodemanager.resource.cpu.enabled".
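
For illustration, the updated document could recommend something along these lines. This is only a sketch, not a verified configuration: it assumes that LinuxContainerExecutor is still required and that "yarn.nodemanager.resource.cpu.enabled" takes a boolean value. The full property name "yarn.nodemanager.resource.percentage-physical-cpu-limit" and the value 80 are placeholders chosen for the example. Whether this is sufficient on a given release, and which cgroup mount settings it needs, would have to be confirmed as part of this issue.
{noformat}
<!-- Sketch only; values below are illustrative assumptions, not tested settings. -->
<property>
  <name>yarn.nodemanager.container-executor.class</name>
  <value>org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor</value>
</property>
<property>
  <!-- New-style switch proposed above; assumed to be a boolean. -->
  <name>yarn.nodemanager.resource.cpu.enabled</name>
  <value>true</value>
</property>
<property>
  <!-- Example limit; 80 is an arbitrary value below 100. -->
  <name>yarn.nodemanager.resource.percentage-physical-cpu-limit</name>
  <value>80</value>
</property>
{noformat}
Note that the legacy "yarn.nodemanager.linux-container-executor.resources-handler.class" property does not appear in this sketch, which is the point of the second proposal.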

Thoughts? [~leftnoteasy], [~vinodkv], [~vvasudev]


> Refine the document or code related to legacy CPU isolation/enforcement
> -----------------------------------------------------------------------
>
>                 Key: YARN-8924
>                 URL: https://issues.apache.org/jira/browse/YARN-8924
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: Zhankun Tang
>            Assignee: Zhankun Tang
>            Priority: Minor



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)
