Alejandro Abdelnur created YARN-951:
---------------------------------------

             Summary: Add hard minimum resource capabilities for container 
launching
                 Key: YARN-951
                 URL: https://issues.apache.org/jira/browse/YARN-951
             Project: Hadoop YARN
          Issue Type: Bug
          Components: nodemanager
    Affects Versions: 2.1.0-beta
            Reporter: Alejandro Abdelnur


This is a follow-up to YARN-789, which enabled the FairScheduler to handle 
resource requests with zero capability in one dimension (either zero CPU or 
zero memory).

When resource enforcement is enabled (cgroups for CPU and 
ProcfsBasedProcessTree for memory), we cannot use zero because the underlying 
container processes will be killed.

We need to introduce an absolute or hard minimum:

* For CPU. Hard enforcement can be done via the cgroup cpu controller. Using an 
absolute minimum of a few CPU shares (e.g. 10) in the LinuxContainerExecutor, 
we ensure there are enough CPU cycles to run the sleep process. This absolute 
minimum would only kick in if zero is allowed; otherwise it will never kick 
in, as the shares for 1 CPU are 1024.

* For memory. Hard enforcement is currently done by 
ProcfsBasedProcessTree.java; using an absolute minimum of 1 or 2 MB would take 
care of zero memory resources. Again, this absolute minimum would only kick in 
if zero is allowed; otherwise it will never kick in, as the memory increment 
is several MB, if not 1 GB.
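
As a rough illustration of the two bullets above, the following sketch 
computes the effective cgroup CPU shares and memory limit for a container. It 
is not the LinuxContainerExecutor code; the class and method names are 
hypothetical, and only the 1024-shares-per-CPU figure and the example minimums 
(10 shares, 2 MB) come from this description.

```java
// Hypothetical sketch; names are illustrative, not YARN APIs.
final class CgroupShareSketch {
    // cgroup cpu shares corresponding to one CPU, as stated above.
    static final int SHARES_PER_CPU = 1024;

    // Shares to assign: never below the hard minimum.
    static int effectiveCpuShares(int vcores, int hardMinShares) {
        return Math.max(vcores * SHARES_PER_CPU, hardMinShares);
    }

    // Memory limit in MB to enforce: never below the hard minimum.
    static int effectiveMemoryMb(int requestedMb, int hardMinMb) {
        return Math.max(requestedMb, hardMinMb);
    }

    public static void main(String[] args) {
        // A zero-CPU container is lifted to the hard minimum of 10 shares.
        System.out.println(effectiveCpuShares(0, 10));
        // A one-CPU container is unaffected, since 1024 > 10.
        System.out.println(effectiveCpuShares(1, 10));
        // A zero-memory container is lifted to 2 MB.
        System.out.println(effectiveMemoryMb(0, 2));
    }
}
```

With a non-zero request the hard minimum is always the smaller operand, which 
is why it only matters when zero capabilities are allowed.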

There would be no default for this hard minimum; if it is not set, no 
correction will be done. If it is set, then MAX(hard-minimum, 
container-resource-capability) will be used.
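
The rule above could be sketched as follows, assuming the optional setting is 
modelled as an OptionalInt. The class and method names are hypothetical and 
this is only one possible shape, not the proposed patch.

```java
import java.util.OptionalInt;

// Hypothetical sketch; names are illustrative, not YARN APIs.
final class HardMinimumSketch {
    // If no hard minimum is configured, the capability is returned unchanged.
    // Otherwise MAX(hard-minimum, container-resource-capability) is used.
    static int apply(OptionalInt hardMinimum, int capability) {
        if (!hardMinimum.isPresent()) {
            return capability; // not set: no correction
        }
        return Math.max(hardMinimum.getAsInt(), capability);
    }

    public static void main(String[] args) {
        // Unset: a zero-memory request stays at zero.
        System.out.println(apply(OptionalInt.empty(), 0));
        // Set to 2 MB: a zero-memory request is raised to 2.
        System.out.println(apply(OptionalInt.of(2), 0));
        // Set to 2 MB: a 512 MB request is left alone.
        System.out.println(apply(OptionalInt.of(2), 512));
    }
}
```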

Effectively there will not be any impact unless the hard minimum capabilities 
are explicitly set.

Even if set, the hard-minimum value will only kick in when the scheduler is 
configured to allow zero capabilities, or when it is set to a value higher 
than the MIN capabilities for a container.

Expected values, when set, would be 10 shares for CPU and 2 MB for memory.


--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira
