[ 
https://issues.apache.org/jira/browse/YARN-3243?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14359089#comment-14359089
 ] 

Wangda Tan commented on YARN-3243:
----------------------------------

Hi [~jianhe],
I've done all updates, thanks for your comments!

bq. CapacityScheduler, why following code is moved ?
This is because CS will call updateClusterResource after the a new node is 
added, which will set resourceLimits of queues. To set queue's resourceLimits, 
it needs to how much resource in each partition, so it need to call 
labelManager.activeNode before setting queue's resourceLimits.

Attaching new patch (ver.5)

> CapacityScheduler should pass headroom from parent to children to make sure 
> ParentQueue obey its capacity limits.
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-3243
>                 URL: https://issues.apache.org/jira/browse/YARN-3243
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler, resourcemanager
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-3243.1.patch, YARN-3243.2.patch, YARN-3243.3.patch, 
> YARN-3243.4.patch, YARN-3243.5.patch
>
>
> Now CapacityScheduler has some issues to make sure ParentQueue always obeys 
> its capacity limits, for example:
> 1) When allocating container of a parent queue, it will only check 
> parentQueue.usage < parentQueue.max. If leaf queue allocated a container.size 
> > (parentQueue.max - parentQueue.usage), parent queue can excess its max 
> resource limit, as following example:
> {code}
>         A  (usage=54, max=55)
>        /     \
>       A1     A2 (usage=1, max=55)
> (usage=53, max=53)
> {code}
> Queue-A2 is able to allocate container since its usage < max, but if we do 
> that, A's usage can excess A.max.
> 2) When doing continous reservation check, parent queue will only tell 
> children "you need unreserve *some* resource, so that I will less than my 
> maximum resource", but it will not tell how many resource need to be 
> unreserved. This may lead to parent queue excesses configured maximum 
> capacity as well.
> With YARN-3099/YARN-3124, now we have {{ResourceUsage}} class in each class, 
> *here is my proposal*:
> - ParentQueue will set its children's ResourceUsage.headroom, which means, 
> *maximum resource its children can allocate*.
> - ParentQueue will set its children's headroom to be (saying parent's name is 
> "qA"): min(qA.headroom, qA.max - qA.used). This will make sure qA's 
> ancestors' capacity will be enforced as well (qA.headroom is set by qA's 
> parent).
> - {{needToUnReserve}} is not necessary, instead, children can get how much 
> resource need to be unreserved to keep its parent's resource limit.
> - More over, with this, YARN-3026 will make a clear boundary between 
> LeafQueue and FiCaSchedulerApp, headroom will consider user-limit, etc.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to