[ 
https://issues.apache.org/jira/browse/YARN-3216?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14959360#comment-14959360
 ] 

Wangda Tan commented on YARN-3216:
----------------------------------

Thanks for update [~sunilg].
bq. As suggested in an earlier comment, I created a JIRA to track 
user-am-resource-limit per-partition separately. YARN-4229. I will link the 
issue here.
I think most of user-am-resource-limit per partition is covered by the patch, 
correct? One thing lacking in the patch is, we haven't consider number of 
active users for each partition, I'm not sure if that is the highest priority.

bq. So max-capacity will be helpful here, and as I see it getAMResourceLimit() 
is having this also considered.
I can remember what we discussed while doing YARN-2637, I think the approach in 
your patch is slightly different from YARN-2637:

YARN-2637 is: amLimit = max(queue-limit, queue-configured-capacity) * 
max-am-percent
And queue-limit is based on queue's max capacity as well as sibling used 
resource, so it is possible queue-limit less than queue-configured-capacity.

In your patch:
amLimit = max(queue-max-capacity, queue-configured-capacity) * max-am-percent. 
Since the queue-max-capacity will be *always* >= queue-configured-capacity.

Before we have queue-limit computation for node partition, I think we should 
use queue-configured-capacity. Otherwise, a queue has max-capacity >> 
configured-capacity will be problematic.

Few more comments:
- For the userAMCheck:
{code}
if (getNumActiveApplications() < 1) {
   //...
}
{code}
I think we should be able to allocate at least one AM container in the 
partition, correct? Just like what you did to queue:
{code}
                    || (Resources.lessThanOrEqual(resourceCalculator,
                        lastClusterResource, 
queueUsage.getAMUsed(partitionName),
                        Resources.none()))) {
{code}
- Could you use 
{{org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestUtils.getConfigurationWithQueueLabels(Configuration)}}
 instead of redefining the queue's configuration in the test?
- Could you add a brief description at each of your test case?
- I think there're two cases need to be covered: 1) User's AM limit 2) AM-limit 
of queue/user which can allocate multiple AM (I didn't see the case in your 
test, this is important to make sure we calculate total AM resource correct). 
3) When AM-usage will be updated after we update partition of nodes.

Thoughts?

> Max-AM-Resource-Percentage should respect node labels
> -----------------------------------------------------
>
>                 Key: YARN-3216
>                 URL: https://issues.apache.org/jira/browse/YARN-3216
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Wangda Tan
>            Assignee: Sunil G
>            Priority: Critical
>         Attachments: 0001-YARN-3216.patch, 0002-YARN-3216.patch, 
> 0003-YARN-3216.patch, 0004-YARN-3216.patch, 0005-YARN-3216.patch, 
> 0006-YARN-3216.patch, 0007-YARN-3216.patch
>
>
> Currently, max-am-resource-percentage considers default_partition only. When 
> a queue can access multiple partitions, we should be able to compute 
> max-am-resource-percentage based on that.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to