[ https://issues.apache.org/jira/browse/YARN-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141661#comment-14141661 ]
Wangda Tan commented on YARN-2496:
----------------------------------

Hi [~cwelch],

Thanks for your comments:

1) Regarding,
{code}
Headroom Calculation for JobA:
userConsumed = 8G
maxCapacityConsiderLabelA = 6G (Node1 only)
headroom = -2G (assume it will normalize to 0G)
{code}
Currently, we calculate headroom as
bq. Headroom = min(userLimit, queue-max-cap, max-capacity-consider-label) - consumed

The {max-capacity-consider-label} is queue-wise, not app-wise, so at the queue level max-capacity-consider-label = node1 + node2. In other words, {max-capacity-consider-label} is guaranteed to always be greater than or equal to the total resource the queue will use.

2) Regarding
bq. The "labels" are the labels for the queue, but the resource requests coming from the application can be a subset of that, no? So if application "a" is running on a queue with labels a and b, but it has a label expression of only a, which it is using for resource requests, it's going to get a headroom based on nodes with both labels a and b, but in fact it only has a "real" headroom for nodes with label "a"

Yes and no: even if app-a has label "a" at the app level, its ResourceRequest(s) can override it and use b. The app-level label is just the default label expression, used when a ResourceRequest doesn't set one. So app-a can still use all labels of the queue.

3) Regarding,
bq. On the parent/leaf refactor to share AbstractCSQueue - a great idea, thought about it myself when seeing the duplication,

I agree. I think it may not be too risky, but it will hide the functional changes we made. Let's get more opinions on this, because reverting it would take some effort.

4) Regarding,
bq. CSQueueUtils - just removing a line, should revert

Will do.

5) Regarding,
bq.
SchedulerUtils.checkNodeLabelExpression - I think there is an issue here with the * case

{{checkNodeLabelExpression}} is used to check whether a ResourceRequest can be allocated on a node. We don't support specifying * in any label expression (including ResourceRequest, ASC, queue-default-label-expression), because that would cause many problems. Instead, we support * in the queue's labels (not default-label-expression), which means the queue *can* access any label. The checking methods are {{checkQueueAccessToNode}} and {{checkQueueLabelExpression}}.

Thanks,
Wangda

> [YARN-796] Changes for capacity scheduler to support allocate resource respect labels
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-2496
>                 URL: https://issues.apache.org/jira/browse/YARN-2496
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-2496.patch, YARN-2496.patch, YARN-2496.patch, YARN-2496.patch
>
>
> This JIRA Includes:
> - Add/parse labels option to {{capacity-scheduler.xml}}, similar to other queue options like capacity/maximum-capacity, etc.
> - Include a "default-label-expression" option in queue config; if an app doesn't specify a label-expression, the "default-label-expression" of the queue will be used.
> - Check whether labels can be accessed by the queue when submitting an app with a label-expression to the queue or updating a ResourceRequest with a label-expression
> - Check labels on the NM when trying to allocate a ResourceRequest with a label-expression on that NM
> - Respect labels when calculating headroom/user-limit

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
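
For illustration only, the headroom formula from point 1 of the comment can be sketched in Java as below. This is not the CapacityScheduler code: the class and method names, the use of plain gigabyte values instead of YARN resource objects, the clamp of negative results to zero, and the 6G size assumed for node2 (along with the 12G user limit and queue maximum) are all hypothetical choices made for the example.

```java
public class HeadroomSketch {

    /**
     * Headroom = min(userLimit, queue-max-cap, max-capacity-consider-label) - consumed.
     * All values are in GB. Clamping a negative result to 0 is an assumption
     * taken from the "normalize to 0G" remark in the quoted example.
     */
    static long headroomGb(long userLimitGb,
                           long queueMaxCapGb,
                           long maxCapConsiderLabelGb,
                           long consumedGb) {
        long limit = Math.min(userLimitGb, Math.min(queueMaxCapGb, maxCapConsiderLabelGb));
        return Math.max(0L, limit - consumedGb);
    }

    public static void main(String[] args) {
        long consumed = 8;     // userConsumed = 8G, from the quoted example
        long node1 = 6;        // Node1 = 6G, from the quoted example
        long node2 = 6;        // hypothetical: the size of node2 is not given in the thread
        long userLimit = 12;   // hypothetical
        long queueMaxCap = 12; // hypothetical

        // App-wise view (label "a" only, Node1 only): 6G - 8G is negative, clamped to 0.
        System.out.println(headroomGb(userLimit, queueMaxCap, node1, consumed));

        // Queue-wise view (node1 + node2): 12G - 8G leaves 4G.
        System.out.println(headroomGb(userLimit, queueMaxCap, node1 + node2, consumed));
    }
}
```

With the queue-wise value, max-capacity-consider-label covers every node the queue can reach, so the subtraction does not go negative merely because one application restricts itself to a subset of the queue's labels.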