[
https://issues.apache.org/jira/browse/YARN-5892?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15972782#comment-15972782
]
Eric Payne commented on YARN-5892:
----------------------------------
[~leftnoteasy], thank you very much for your in-depth review and comments.
{quote}
1) When there're several active users with \[combined sum of\] weights < 1. ...
However in this implementation ... a1 can get all queue's resource (because
#active-user-applied-weights = 1/0.3) while a2 got starved.
{quote}
No, that's not how it will work with this implementation.
[~sunilg] had a similar question
[above|https://issues.apache.org/jira/browse/YARN-5892?focusedCommentId=15966197&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15966197].
Having a combined sum of weights < 1 works because {{userLimitResource}} (the
return value of {{computeUserLimit}}) is only ever used by
{{getComputedResourceLimitFor\[Active|All\]Users}}, which multiplies the value
of {{userLimitResource}} by the appropriate user weight(s). This will result in
the correct value of {{userLimit}} for each specific user. If the sum of the
active users' weights is < 1, then it is true that {{userLimitResource}} is
larger than the actual user limit, and sometimes even larger than the actual
amount of resources used. However, this algorithm calculates {{userLimit}}
correctly and consistently when
{{getComputedResourceLimitFor\[Active|All\]Users}} multiplies
{{userLimitResource}} by each user's weight.
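To illustrate the arithmetic, here is a minimal sketch, not the code from the patch. It assumes a simplified model in which {{userLimitResource}} is the queue's resource divided by the sum of the active users' weights; the real {{computeUserLimit}} also accounts for {{minimum-user-limit-percent}} and other factors. The class name, method signatures, and the sample numbers are hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class UserLimitSketch {

    // ASSUMPTION: simplified stand-in for computeUserLimit. It divides the
    // queue's resource by the sum of the active users' weights and ignores
    // minimum-user-limit-percent, user-limit-factor, and headroom.
    public static double userLimitResource(double queueResource,
                                           Map<String, Double> activeUserWeights) {
        double sumOfWeights = 0.0;
        for (double weight : activeUserWeights.values()) {
            sumOfWeights += weight;
        }
        return queueResource / sumOfWeights;
    }

    // The step described above: the shared value is scaled by the weight of
    // the specific user to obtain that user's limit.
    public static double userLimit(double userLimitResource, double weight) {
        return userLimitResource * weight;
    }

    public static void main(String[] args) {
        // Hypothetical users a1 and a2, each with weight 0.3 (sum is 0.6 < 1).
        Map<String, Double> weights = new LinkedHashMap<>();
        weights.put("a1", 0.3);
        weights.put("a2", 0.3);

        double shared = userLimitResource(100.0, weights);
        System.out.println("userLimitResource = " + shared);
        for (Map.Entry<String, Double> e : weights.entrySet()) {
            System.out.println(e.getKey() + " userLimit = "
                + userLimit(shared, e.getValue()));
        }
    }
}
```

Under these assumptions, with a queue of 100 units the shared value is roughly 166.7, which is larger than the queue itself, yet each user's limit comes out at roughly 50, so neither a1 nor a2 is starved.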
bq. 2) I would like to prevent setting user's weight to <= 0.
Instead of logging a warning, I will make parsing of the CS config fail if a
weight is < 0. I would like [~jlowe]'s and [~nroberts]'s feedback on whether or
not {{weight == 0}} is reasonable and consistent.
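A rough sketch of what such a check could look like follows. The helper name {{parseUserWeight}} and its signature are hypothetical and are not taken from the patch or from the CapacityScheduler configuration classes. It rejects negative weights and accepts {{0}} only because that question is still open.

```java
public class UserWeightParser {

    // Hypothetical helper: parses a user's weight from its raw config value
    // and fails, like any other parse error, when the weight is negative.
    public static float parseUserWeight(String user, String rawValue) {
        float weight;
        try {
            weight = Float.parseFloat(rawValue);
        } catch (NumberFormatException e) {
            throw new IllegalArgumentException(
                "Invalid weight for user " + user + ": " + rawValue, e);
        }
        // ASSUMPTION: weight == 0 is allowed here pending feedback;
        // only negative values and NaN are rejected.
        if (Float.isNaN(weight) || weight < 0.0f) {
            throw new IllegalArgumentException(
                "Weight for user " + user + " must not be negative: " + rawValue);
        }
        return weight;
    }
}
```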
{quote}
Generally speaking, set user weight < 1 is a reasonable requirement however I
don't think we're ready for that. It looks there're bunch of things we need to
do to make #2 and related preemption logic works properly.
{quote}
I am afraid that I disagree for reasons stated above. #2 can be addressed with
a simple check that treats failure the same as other parsing issues. The one
concern that remains in my mind is to ensure that this algorithm calculates
{{allUserLimit}} correctly for preemption. I have not yet combined and tested
this patch with the one for YARN-2113. I will do so and post my findings.
{quote}
Beyond that, I suggest to make #active-users-times-weight can updated in O(1)
for every changes to active users set or any active user's weight get updated.
{quote}
Yes, good point. Although #active-users-times-weight and #users-times-weight
are only calculated in {{computeUserLimit}}, and {{computeUserLimit}} is only
called when a significant event happens, we could eliminate the need to
recalculate them for things like container allocation and container release
events. I will modify the patch to do this.
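One possible shape for that change, as a sketch only: keep a running sum that is adjusted whenever a user becomes active, becomes inactive, or has its weight changed, so that reading the sum is O(1). The class and method names below are hypothetical and do not come from the patch.

```java
import java.util.HashMap;
import java.util.Map;

public class ActiveWeightTracker {

    // Weight of each currently active user.
    private final Map<String, Double> activeWeights = new HashMap<>();

    // Running value of #active-users-times-weight, adjusted incrementally.
    private double activeUsersTimesWeight = 0.0;

    // Marks a user active, or replaces the weight of an already active user.
    public void activate(String user, double weight) {
        Double previous = activeWeights.put(user, weight);
        if (previous != null) {
            activeUsersTimesWeight -= previous;
        }
        activeUsersTimesWeight += weight;
    }

    // Marks a user inactive; a no-op if the user was not active.
    public void deactivate(String user) {
        Double previous = activeWeights.remove(user);
        if (previous != null) {
            activeUsersTimesWeight -= previous;
        }
    }

    // Applies a weight change; only active users affect the running sum.
    public void updateWeight(String user, double newWeight) {
        if (activeWeights.containsKey(user)) {
            activate(user, newWeight);
        }
    }

    // O(1) read, so container allocation and release need no recalculation.
    public double getActiveUsersTimesWeight() {
        return activeUsersTimesWeight;
    }
}
```

The same pattern would apply to #users-times-weight with a second map covering all users. Floating-point drift from repeated additions and subtractions is ignored in this sketch.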
{quote}
Also, weight of users applies to hard limit of user (user limit factor) as
well. This is a gray area to me, since it may cause some issue of resource
planning (one more factor apply to maximum resource of user). Would like to
hear thoughts from Jason Lowe/Sunil G as well.
{quote}
I look forward to [~jlowe]'s and [~sunilg]'s comments.
> Capacity Scheduler: Support user-specific minimum user limit percent
> --------------------------------------------------------------------
>
> Key: YARN-5892
> URL: https://issues.apache.org/jira/browse/YARN-5892
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: capacityscheduler
> Reporter: Eric Payne
> Assignee: Eric Payne
> Attachments: Active users highlighted.jpg, YARN-5892.001.patch,
> YARN-5892.002.patch, YARN-5892.003.patch, YARN-5892.004.patch,
> YARN-5892.005.patch, YARN-5892.006.patch, YARN-5892.007.patch,
> YARN-5892.008.patch, YARN-5892.009.patch
>
>
> Currently, in the capacity scheduler, the {{minimum-user-limit-percent}}
> property is per queue. A cluster admin should be able to set the minimum user
> limit percent on a per-user basis within the queue.
> This functionality is needed so that when intra-queue preemption is enabled
> (YARN-4945 / YARN-2113), some users can be deemed as more important than
> other users, and resources from VIP users won't be as likely to be preempted.
> For example, if the {{getstuffdone}} queue has a MULP of 25 percent, but user
> {{jane}} is a power user of queue {{getstuffdone}} and needs to be guaranteed
> 75 percent, the properties for {{getstuffdone}} and {{jane}} would look like
> this:
> {code}
> <property>
>   <name>yarn.scheduler.capacity.root.getstuffdone.minimum-user-limit-percent</name>
>   <value>25</value>
> </property>
> <property>
>   <name>yarn.scheduler.capacity.root.getstuffdone.jane.minimum-user-limit-percent</name>
>   <value>75</value>
> </property>
> {code}