[ 
https://issues.apache.org/jira/browse/YARN-3119?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14304157#comment-14304157
 ] 

Vinod Kumar Vavilapalli commented on YARN-3119:
-----------------------------------------------

bq. IMHO, this could be problematic if an under-used container (c1) wants to 
get more resources, but the resources are over-used by another container (c2). 
It is possible that c1 tries to allocate but fails because memory is exhausted, 
since the NM needs some time to get the resources back (kill c2).
This is the counter use-case that kills the feature. It leads to a pretty bad 
experience for apps through no fault of their own.

Though I do remember this being proposed back at Yahoo! by [~rajesh.balamohan] 
a few years ago - this helps improve throughput, albeit with a loss of 
predictability.

Let's put this behind a flag that is off by default. When it is on, we can do 
better:
 - kill the over-limit container if it exceeds scheduler.maximum-allocation-mb 
of the cluster.
 - let a container grow beyond its reservation if and only if there is 
unreserved capacity on the node. This way, we only let containers grow when 
there is capacity that is not allocated to any container. The limiting 
constraint is the following: {code}Sum of excess capacity used by all 
containers beyond their individual limits <= Unreserved capacity on the 
node.{code}
 - when aggregate usage crosses the threshold, kill only as many containers as 
necessary instead of everything.
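
To make the three rules above concrete, here is a rough sketch in Python. It is 
not the NodeManager code and not the attached patch. All names (Container, 
containers_to_kill, the parameters) are made up for illustration. Two things 
are my assumptions rather than part of the proposal: victims are picked 
largest-excess-first, and unreserved capacity is computed once up front, so 
reservations freed by a kill are not counted as room to grow.

```python
from dataclasses import dataclass


@dataclass
class Container:
    cid: str
    limit_mb: int  # memory reserved for the container
    usage_mb: int  # memory the container currently uses


def containers_to_kill(containers, node_capacity_mb, max_allocation_mb):
    """Return ids of containers to kill (hypothetical helper, illustration only)."""
    kills = []
    survivors = []

    # Rule 1: a container above the cluster-wide maximum allocation is
    # always killed, regardless of spare capacity on the node.
    for c in containers:
        if c.usage_mb > max_allocation_mb:
            kills.append(c.cid)
        else:
            survivors.append(c)

    # Rule 2: containers may only grow into capacity that is not reserved
    # for anyone. Assumption: computed once, over all reservations.
    unreserved_mb = node_capacity_mb - sum(c.limit_mb for c in containers)

    # Containers running over their own limit, largest excess first
    # (the victim order is an assumption, not part of the proposal).
    over = sorted(
        (c for c in survivors if c.usage_mb > c.limit_mb),
        key=lambda c: c.usage_mb - c.limit_mb,
        reverse=True,
    )
    excess_mb = sum(c.usage_mb - c.limit_mb for c in over)

    # Rule 3: kill only until the constraint
    #   sum of excess <= unreserved capacity
    # holds again, instead of killing every over-limit container.
    for c in over:
        if excess_mb <= unreserved_mb:
            break
        kills.append(c.cid)
        excess_mb -= c.usage_mb - c.limit_mb

    return kills
```

With three containers reserved at 2048 MB each and two of them over their 
limit, nothing is killed as long as the combined excess fits into the 
unreserved part of the node. Once it does not, only the largest offender goes.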

Anything more that can help? [~rajesh.balamohan]/[~leftnoteasy]?

> Memory limit check need not be enforced unless aggregate usage of all 
> containers is near limit
> ----------------------------------------------------------------------------------------------
>
>                 Key: YARN-3119
>                 URL: https://issues.apache.org/jira/browse/YARN-3119
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: nodemanager
>            Reporter: Anubhav Dhoot
>            Assignee: Anubhav Dhoot
>         Attachments: YARN-3119.prelim.patch
>
>
> Today we kill any container preemptively even if the total usage of 
> containers on that node is well within the limit for YARN. Instead, if we 
> enforce the memory limit only when the total usage of all containers is close 
> to some configurable ratio of the overall memory assigned to containers, we 
> can allow for flexibility in container memory usage without adverse effects. 
> This is similar in principle to how cgroups uses soft_limit_in_bytes.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
