[ 
https://issues.apache.org/jira/browse/YARN-10009?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16986411#comment-16986411
 ] 

Eric Payne edited comment on YARN-10009 at 12/4/19 6:12 PM:
------------------------------------------------------------

The root cause is here:
{code:title=UsersManager#computeUserLimit}
    /*
     * User limit resource is determined by: max(currentCapacity / #activeUsers,
     * currentCapacity * user-limit-percentage%)
     */
    Resource userLimitResource = Resources.max(resourceCalculator,
        partitionResource,
        Resources.divideAndCeil(resourceCalculator, resourceUsed,
            usersSummedByWeight),
        Resources.divideAndCeil(resourceCalculator,
            Resources.multiplyAndRoundDown(currentCapacity, getUserLimit()),
            100));
{code}
When calculating the user resource limit, {{divideAndCeil}} is called twice to 
take the max of either (queue capacity / # of active users) or (queue capacity 
* min user limit pct / 100). However, the two calls resolve to different 
{{divideAndCeil}} overloads. The first takes a {{Resource}} and a {{float}}, 
and the second takes a {{Resource}} and an {{int}}. The overload with the 
{{(Resource, float)}} signature was never updated to handle custom resources.

The only place that calls {{divideAndCeil(Resource, float)}} is here in 
{{UsersManager#computeUserLimit}}.
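
To illustrate the overload mismatch, here is a minimal, self-contained sketch. 
It does not use the real YARN classes: {{DivideAndCeilSketch}} and the 
{{long[]}} layout (memory, vcores, res_1) are hypothetical stand-ins. It also 
assumes that "never updated" means the float overload fills in only memory and 
vcores and leaves any custom resource at zero.

```java
import java.util.Arrays;

// Hypothetical stand-in for a YARN Resource: index 0 = memory (MB),
// index 1 = vcores, index 2 and up = custom resource types such as res_1.
final class DivideAndCeilSketch {

  // Models an overload that was extended to cover every resource type.
  static long[] divideAndCeil(long[] resource, int divisor) {
    long[] out = new long[resource.length];
    for (int i = 0; i < resource.length; i++) {
      out[i] = (resource[i] + divisor - 1) / divisor; // integer ceiling
    }
    return out;
  }

  // Models an overload that only knows about memory and vcores
  // (an assumption for illustration): custom resources stay at zero.
  static long[] divideAndCeil(long[] resource, float divisor) {
    long[] out = new long[resource.length];
    out[0] = (long) Math.ceil(resource[0] / (double) divisor);
    out[1] = (long) Math.ceil(resource[1] / (double) divisor);
    return out;
  }

  public static void main(String[] args) {
    long[] queue = {20480, 100, 80}; // 20GB, 100 vcores, 80 res_1

    // An int divisor selects the overload that keeps res_1.
    System.out.println(Arrays.toString(divideAndCeil(queue, 1)));
    // A float divisor selects the overload that drops res_1.
    System.out.println(Arrays.toString(divideAndCeil(queue, 1.0f)));
  }
}
```

Under these assumptions the first line printed is {{[20480, 100, 80]}} and the 
second is {{[20480, 100, 0]}}. The only difference between the two calls is the 
type of the divisor, which is what selects the overload.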



> In Capacity Scheduler, DRC can treat minimum user limit percent as a max when 
> custom resource is defined
> --------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-10009
>                 URL: https://issues.apache.org/jira/browse/YARN-10009
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: capacity scheduler
>    Affects Versions: 2.10.0, 3.3.0, 3.2.1, 3.1.3, 2.11.0
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>            Priority: Major
>         Attachments: YARN-10009.001.patch, YARN-10009.UT.patch
>
>
> | |Memory|Vcores|res_1|
> |Queue1 Totals|20GB|100|80|
> |Resources requested by App1 in Queue1|8GB (40% of total)|8 (8% of total)|80 (100% of total)|
> In the use case above:
>  - Queue1 has a value of 25 for {{minimum-user-limit-percent}}
>  - User1 has requested 8 containers with {{<memory:1GB, vcores:8, res_1:10>}} 
> each
>  - {{res_1}} will be the dominant resource in this case.
> All 8 containers should be assigned by the capacity scheduler, but with min 
> user limit pct set to 25, only 2 containers are assigned.
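
The container counts in the quoted description can be checked with a short 
calculation. This is a simplified model, not the scheduler's actual assignment 
logic: {{UserLimitArithmetic}}, {{percentOf}} and {{containersThatFit}} are 
hypothetical names, and the sketch simply asks how many whole containers fit 
under a given per-resource limit.

```java
// Hypothetical helper for checking the arithmetic in the description.
// Resource layout: index 0 = memory (MB), 1 = vcores, 2 = res_1.
final class UserLimitArithmetic {

  // Scales every resource dimension to pct percent, rounding down.
  static long[] percentOf(long[] total, int pct) {
    long[] out = new long[total.length];
    for (int i = 0; i < total.length; i++) {
      out[i] = total[i] * pct / 100;
    }
    return out;
  }

  // Number of whole containers that fit under the limit in every dimension.
  static long containersThatFit(long[] limit, long[] perContainer) {
    long fit = Long.MAX_VALUE;
    for (int i = 0; i < limit.length; i++) {
      if (perContainer[i] > 0) {
        fit = Math.min(fit, limit[i] / perContainer[i]);
      }
    }
    return fit;
  }

  public static void main(String[] args) {
    long[] queue = {20480, 100, 80};   // Queue1 totals: 20GB, 100 vcores, 80 res_1
    long[] container = {1024, 8, 10};  // <memory:1GB, vcores:8, res_1:10>

    // Limit = whole queue: res_1 is the bottleneck at 80 / 10 containers.
    System.out.println(containersThatFit(queue, container));
    // Limit = 25% of the queue, i.e. the minimum treated as a maximum.
    System.out.println(containersThatFit(percentOf(queue, 25), container));
  }
}
```

With the figures from the table this prints 8 and then 2, matching the expected 
and observed container counts. In both cases {{res_1}} is the limiting 
dimension.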


