[ 
https://issues.apache.org/jira/browse/YARN-2496?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14141661#comment-14141661
 ] 

Wangda Tan commented on YARN-2496:
----------------------------------

Hi [~cwelch],
Thanks for your comments:

1) Regarding,
{code}
Headroom Calculation for JobA:
userConsumed = 8G
maxCapacityConsiderLabelA = 6G (Node1 only)
headroom = -2G (assume it will normalize to 0G)
{code}

Currently, we calculate headroom as:
bq. Headroom = min(userLimit, queue-max-cap, max-capacity-consider-label) - consumed
The {max-capacity-consider-label} is computed per queue, not per app, so at the 
queue level max-capacity-consider-label = node1 + node2.
In other words, {max-capacity-consider-label} is guaranteed to always be 
greater than or equal to the total resources the queue will use.
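
To make the difference concrete, here is a minimal sketch of the formula above. It is not the CapacityScheduler code: the class and method names are hypothetical. userConsumed = 8G and Node1 = 6G come from the example quoted in 1); the Node2 size (6G), userLimit (16G) and queue-max-cap (16G) are assumed values chosen only for illustration, and the floor at 0 follows the "normalize to 0G" assumption in the quote.

```java
// Hypothetical sketch of the headroom formula; names are illustrative only.
final class HeadroomSketch {

    /** min(userLimit, queueMaxCap, maxCapacityConsiderLabel) - consumed, floored at 0 (assumed). */
    static long headroomGb(long userLimitGb, long queueMaxCapGb,
                           long maxCapacityConsiderLabelGb, long consumedGb) {
        long limit = Math.min(userLimitGb,
                Math.min(queueMaxCapGb, maxCapacityConsiderLabelGb));
        return Math.max(0L, limit - consumedGb);
    }

    public static void main(String[] args) {
        long node1Gb = 6;                   // from the quoted example
        long node2Gb = 6;                   // assumed size, not given in this comment
        long queueWideGb = node1Gb + node2Gb;

        // App-wise reading (Node1 only): 6 - 8 = -2, floored to 0.
        System.out.println(headroomGb(16, 16, node1Gb, 8));      // prints 0
        // Queue-wise reading (node1 + node2): 12 - 8 = 4.
        System.out.println(headroomGb(16, 16, queueWideGb, 8));  // prints 4
    }
}
```

With the queue-wise value, consumed can never exceed max-capacity-consider-label on its own, so the negative headroom in the quoted example does not arise from that term.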

2) Regarding
bq. The "labels" are the labels for the queue, but the resource requests coming 
from the application can be a subset of that, no? So if application "a" is 
running on a queue with labels a and b, but it has a label expression of only 
a, which it is using for resource requests, it's going to get a headroom based 
on nodes with both labels a and b, but in fact it only has a "real" headroom 
for nodes with label "a"
Yes and no: even if app-a specifies label "a" at the app level, its 
ResourceRequest(s) can still override it and use b. The app-level label is just 
the default label expression, used when a ResourceRequest doesn't set one. So 
app-a can still use all labels of the queue.
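
The fallback order described here can be sketched as below. This is only an illustration: the class and method names are hypothetical, and treating "not set" as null is an assumption. The last fallback to the queue's default-label-expression is taken from the issue description, not from this comment.

```java
// Hypothetical sketch of label-expression fallback; not the actual YARN code.
final class LabelExpressionSketch {

    /**
     * Picks the label expression that applies to one ResourceRequest.
     * null means "not set" (an assumption made for this sketch).
     */
    static String effectiveLabelExpression(String requestExpr,
                                           String appExpr,
                                           String queueDefaultExpr) {
        if (requestExpr != null) {
            return requestExpr;      // the request overrides the app-level default
        }
        if (appExpr != null) {
            return appExpr;          // app-level label is only a default
        }
        return queueDefaultExpr;     // queue's default-label-expression, may be null
    }

    public static void main(String[] args) {
        // app-a defaults to "a", but one request asks for "b" explicitly.
        System.out.println(effectiveLabelExpression("b", "a", null));   // prints b
        System.out.println(effectiveLabelExpression(null, "a", null));  // prints a
    }
}
```

Because any request may name another label the queue can access, headroom is based on all of the queue's labels rather than on the app-level default alone.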

3) Regarding,
bq. On the parent/leaf refactor to share AbstractCSQueue - a great idea, 
thought about it myself when seeing the duplication,
I agree. I think it may not be too risky, but it will hide the functional 
changes we made. Let's get more opinions on this, because reverting it would 
take some effort.

4) Regarding,
bq. CSQueueUtils - just removing a line, should revert
Will do 

5) Regarding,
bq. SchedulerUtils.checkNodeLabelExpression - I think there is an issue here 
with the * case
{{checkNodeLabelExpression}} is used to check whether a ResourceRequest can be 
allocated on a node. We don't support specifying * in any label-expression 
(including ResourceRequest, ASC, queue-default-label-expression), because that 
would cause many problems.
Instead, we support * in a queue's labels (not its default-label-expression), 
which means the queue *can* access any label. The checking methods are 
{{checkQueueAccessToNode}} and {{checkQueueLabelExpression}}.
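
A rough sketch of that distinction follows. It is not the body of {{checkQueueAccessToNode}} or {{checkQueueLabelExpression}}: the class and method names are hypothetical, and a label expression is assumed to be a single label name (or null when unset).

```java
import java.util.Set;

// Hypothetical sketch of how "*" is treated; not the actual patch code.
final class QueueLabelAccessSketch {

    static final String ANY = "*";

    /** A queue whose labels contain "*" can access any label. */
    static boolean queueCanAccess(Set<String> queueLabels, String label) {
        return queueLabels.contains(ANY) || queueLabels.contains(label);
    }

    /** "*" is rejected inside a label expression (request, ASC, queue default). */
    static boolean isValidLabelExpression(String expr) {
        return expr == null || !expr.contains(ANY);
    }

    public static void main(String[] args) {
        System.out.println(queueCanAccess(Set.of("*"), "a"));       // prints true
        System.out.println(queueCanAccess(Set.of("a", "b"), "c"));  // prints false
        System.out.println(isValidLabelExpression("*"));            // prints false
    }
}
```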

Thanks,
Wangda

> [YARN-796] Changes for capacity scheduler to support allocate resource 
> respect labels
> -------------------------------------------------------------------------------------
>
>                 Key: YARN-2496
>                 URL: https://issues.apache.org/jira/browse/YARN-2496
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Wangda Tan
>            Assignee: Wangda Tan
>         Attachments: YARN-2496.patch, YARN-2496.patch, YARN-2496.patch, 
> YARN-2496.patch
>
>
> This JIRA Includes:
> - Add/parse labels option to {{capacity-scheduler.xml}} similar to other 
> options of queue like capacity/maximum-capacity, etc.
> - Include a "default-label-expression" option in queue config, if an app 
> doesn't specify label-expression, "default-label-expression" of queue will be 
> used.
> - Check if labels can be accessed by the queue when submitting an app with a 
> label-expression to the queue or updating a ResourceRequest with a 
> label-expression
> - Check labels on the NM when trying to allocate a ResourceRequest with a 
> label-expression on that NM
> - Respect labels when calculating headroom/user-limit



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
