[ 
https://issues.apache.org/jira/browse/YARN-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15049581#comment-15049581
 ] 

Wangda Tan commented on YARN-4415:
----------------------------------

[~Naganarasimha]/[~xinxianyin].

Let me try to summarize what we have been discussing.

There're 2 different configurations:
1) Accessible-node-labels for queue
2) Maximum-capacity for partitions

There are 4 possible combinations of default values:
a. 1)=*, 2)=100
Pros:
- Users don't need to update much configuration when new labels are added
(assuming the partition will be shared with all queues)
Cons:
- Users have to change a lot of configuration when new labels are added
(assuming the partition will be shared with only a few queues)

b. 1)=*, 2)=0
Pros:
- Users don't need to update much configuration when new labels are added
(assuming the partition will be shared with only a few queues)
Cons:
- Users have to change a lot of configuration when new labels are added
(assuming the partition will be shared with all queues)

c. 1)=<empty>, 2)=100
Same as b.

d. 1)=<empty>, 2)=0
Same as b.
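
To illustrate why b, c and d behave alike, here is a minimal Java sketch. It is
a toy model, not CapacityScheduler code: the class and method names are
hypothetical, and it assumes a queue can use a newly added partition without
any configuration change only when the queue can access the label and its
maximum capacity for that partition is above zero.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Toy model of the four default-value combinations (hypothetical, not YARN code). */
final class LabelDefaultsSketch {

    /**
     * Assumption of this sketch: a new partition is usable by a queue with no
     * extra configuration only if the default accessible-node-labels is "*"
     * and the default maximum-capacity for partitions is greater than zero.
     */
    static boolean usableWithoutConfigChange(String defaultAccessibleLabels,
                                             float defaultMaxCapacity) {
        boolean accessible = "*".equals(defaultAccessibleLabels);
        return accessible && defaultMaxCapacity > 0f;
    }

    /** Evaluates combinations a to d in the order they are listed above. */
    static Map<String, Boolean> evaluateAll() {
        Map<String, Boolean> result = new LinkedHashMap<>();
        result.put("a", usableWithoutConfigChange("*", 100f));
        result.put("b", usableWithoutConfigChange("*", 0f));
        result.put("c", usableWithoutConfigChange("", 100f));
        result.put("d", usableWithoutConfigChange("", 0f));
        return result;
    }

    public static void main(String[] args) {
        // Print, for each combination, whether a new partition is usable by default.
        evaluateAll().forEach((name, usable) ->
            System.out.println(name + ": usable without config change = " + usable));
    }
}
```

Under this model only a. makes a new partition usable everywhere by default;
with b, c and d every queue that should use the partition needs an explicit
configuration change.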

As you can see, each choice of default values for the two options has its own 
pros and cons. Frankly, I don't have a strong preference among these choices. 
But since these defaults have been in place since 2.6, I would suggest not 
changing them.

But I think there's one thing we need to fix:
When queue.accessible-node-labels == *, 
{{QueueCapacitiesInfo#QueueCapacitiesInfo(QueueCapacities)}} should call 
{{RMNodeLabelsManager.getClusterNodeLabelNames}} to get all labels instead of 
calling {{getExistingNodeLabels}}. That way, after labels are added or removed, 
the queue's capacities in the web UI/REST response will be updated as well.
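
A rough Java sketch of the idea, using plain collections rather than the real
Hadoop classes. Everything here is hypothetical: {{clusterLabels}} stands in
for what {{getClusterNodeLabelNames}} returns, the keys of
{{configuredMaxCapacity}} stand in for what {{getExistingNodeLabels}} returns,
and the sketch assumes that a label without a configured value is reported
with a max capacity of 0.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

/** Hypothetical model of building the per-label capacity view for the web UI/REST. */
final class QueueCapacityViewSketch {
    static final String ANY = "*";

    /**
     * Returns label -> max capacity to display for one queue.
     * If the queue's accessible labels contain "*", every cluster label is
     * listed; otherwise only labels with explicitly configured capacities are.
     */
    static Map<String, Float> capacitiesForDisplay(
            Set<String> accessibleLabels,
            Map<String, Float> configuredMaxCapacity,
            Set<String> clusterLabels) {
        Set<String> labelsToShow = accessibleLabels.contains(ANY)
            ? new TreeSet<>(clusterLabels)
            : new TreeSet<>(configuredMaxCapacity.keySet());

        Map<String, Float> view = new LinkedHashMap<>();
        for (String label : labelsToShow) {
            // Assumption: a label with no configured value is shown as 0.
            view.put(label, configuredMaxCapacity.getOrDefault(label, 0f));
        }
        return view;
    }
}
```

With this approach a label added to the cluster later shows up in the view on
the next call, because the label set is read from the cluster rather than from
the queue's stored capacities.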

> Scheduler Web Ui shows max capacity for the queue is 100% but when we submit 
> application doesnt get assigned
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4415
>                 URL: https://issues.apache.org/jira/browse/YARN-4415
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler, resourcemanager
>    Affects Versions: 2.7.2
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>         Attachments: App info with diagnostics info.png, 
> capacity-scheduler.xml, screenshot-1.png
>
>
> Steps to reproduce the issue :
> Scenario 1:
> # Configure a queue(default) with accessible node labels as *
> # create an exclusive partition *xxx* and map an NM to it
> # ensure no capacities are configured for default for label xxx
> # start an RM app with queue as default and label as xxx
> # application is stuck but scheduler ui shows 100% as max capacity for that 
> queue
> Scenario 2:
> # create a nonexclusive partition *sharedPartition* and map a NM to it
> # ensure no capacities are configured for default queue
> # start an RM app with queue as *default* and label as *sharedPartition*
> # application is stuck but scheduler ui shows 100% as max capacity for that 
> queue for *sharedPartition*
> For both issues the cause is the same: the default max capacity and abs max 
> capacity are set to zero %



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
