[ 
https://issues.apache.org/jira/browse/YARN-4415?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15803880#comment-15803880
 ] 

Ying Zhang commented on YARN-4415:
----------------------------------

Hi [~Naganarasimha], [~leftnoteasy], [~xinxianyin], we've encountered the same 
issue during our testing, and noticed that this JIRA has been open for a while. I 
understand the reasons [~leftnoteasy] and [~xinxianyin] have for choosing 0 or 
100 as the default max capacity value when it is not set. But the current 
problem is that we use 0 as the default max capacity internally (using the 
constant CSQueueUtils.EPSILON) when allocating resources, while the RM Scheduler 
UI shows 100 as the max capacity (because class PartitionQueueCapacitiesInfo 
uses 100 as its default value in this case). Could we change both places to use 
the same default value to avoid the inconsistency?
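To make the suggestion concrete, here is a minimal sketch of one way to keep the two views consistent. It is not the actual Hadoop code: the class MaxCapacityDefaults, the nested SchedulerCapacities type, the constant DEFAULT_MAX_CAPACITY_PERCENT and the method displayedMaxCapacity are all hypothetical names, and the choice of 0 as the shared default is only an assumption for illustration (the same idea works with 100).

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, simplified model; these are NOT the real Hadoop classes.
public class MaxCapacityDefaults {

    // Assumed single default (in percent) used when no max capacity is configured.
    static final float DEFAULT_MAX_CAPACITY_PERCENT = 0f;

    // Scheduler-side view: maximum capacity per node label partition.
    static final class SchedulerCapacities {
        private final Map<String, Float> maxByLabel = new HashMap<>();

        void setMaxCapacity(String label, float percent) {
            maxByLabel.put(label, percent);
        }

        // Falls back to the shared default when the label has no configured value.
        float getMaxCapacity(String label) {
            return maxByLabel.getOrDefault(label, DEFAULT_MAX_CAPACITY_PERCENT);
        }
    }

    // UI-side view: reads the scheduler value instead of keeping its own default,
    // so the displayed number cannot drift from the one used for allocation.
    static float displayedMaxCapacity(SchedulerCapacities capacities, String label) {
        return capacities.getMaxCapacity(label);
    }

    public static void main(String[] args) {
        SchedulerCapacities capacities = new SchedulerCapacities();
        String label = "xxx"; // partition with no configured capacity

        System.out.println("scheduler max = " + capacities.getMaxCapacity(label));
        System.out.println("displayed max = " + displayedMaxCapacity(capacities, label));
    }
}
```

With a single shared default, the value shown in the UI always matches the value the scheduler uses, whichever default is finally chosen.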
{quote}
But I think there's one thing we need to fix:
When queue.accessible-node-labels == *, 
QueueCapacitiesInfo#QueueCapacitiesInfo(QueueCapacities) should call 
RMNodeLabelsManager.getClusterNodeLabelNames to get all labels instead of 
calling getExistingNodeLabels. So after we add/remove labels, queue's 
capacities in webUI/REST response will be updated as well.
{quote}
[~leftnoteasy], I'm not sure I understand what you mean, but it might be better 
to keep using getExistingNodeLabels, so that only the node label partitions 
that the queue has access to are shown in the RM Scheduler UI.

> Scheduler Web Ui shows max capacity for the queue is 100% but when we submit 
> application doesnt get assigned
> ------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4415
>                 URL: https://issues.apache.org/jira/browse/YARN-4415
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: capacity scheduler, resourcemanager
>    Affects Versions: 2.7.2
>            Reporter: Naganarasimha G R
>            Assignee: Naganarasimha G R
>         Attachments: App info with diagnostics info.png, 
> capacity-scheduler.xml, screenshot-1.png
>
>
> Steps to reproduce the issue:
> Scenario 1:
> # Configure a queue (default) with accessible node labels set to *
> # Create an exclusive partition *xxx* and map an NM to it
> # Ensure no capacities are configured for default for label xxx
> # Start an RM app with queue as default and label as xxx
> # The application is stuck, but the scheduler UI shows 100% as max capacity 
> for that queue
> Scenario 2:
> # Create a non-exclusive partition *sharedPartition* and map an NM to it
> # Ensure no capacities are configured for the default queue
> # Start an RM app with queue as *default* and label as *sharedPartition*
> # The application is stuck, but the scheduler UI shows 100% as max capacity 
> for that queue for *sharedPartition*
> In both scenarios the cause is the same: the default max capacity and abs max 
> capacity are set to zero %.



