[ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16453297#comment-16453297
 ] 

Wangda Tan commented on YARN-4606:
----------------------------------

Thanks [~eepayne] / [~maniraj...@gmail.com],

Here's my understanding of the proposed approach: 

1) When we compute {{max-am-resource-per-user}}, we use #active-users + 
#pending-users.
2) When we compute {{max-user-limit}}, we use #active-users only. 
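A minimal sketch of how the two counts could be derived from a queue's applications. The {{App}} type and both helper methods are hypothetical, not CapacityScheduler classes, and reading #pending-users as "users whose applications are all pending" is an assumption on my side:

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch only; these types are hypothetical, not CapacityScheduler classes.
final class UserCountSketch {

    /** Hypothetical application record: its owner and whether it is active or still pending. */
    static final class App {
        final String user;
        final boolean active;

        App(String user, boolean active) {
            this.user = user;
            this.active = active;
        }
    }

    /** Users with at least one active application: the count used for max-user-limit. */
    static Set<String> activeUsers(List<App> apps) {
        Set<String> users = new HashSet<>();
        for (App app : apps) {
            if (app.active) {
                users.add(app.user);
            }
        }
        return users;
    }

    /** Users whose applications are all pending (assumed reading of #pending-users). */
    static Set<String> pendingOnlyUsers(List<App> apps) {
        Set<String> users = new HashSet<>();
        for (App app : apps) {
            users.add(app.user);
        }
        users.removeAll(activeUsers(apps));
        return users;
    }

    public static void main(String[] args) {
        List<App> apps = List.of(
            new App("user1", true), new App("user2", true),
            new App("user3", false), new App("user4", false));

        int active = activeUsers(apps).size();
        int pending = pendingOnlyUsers(apps).size();

        System.out.println("count for max-user-limit: " + active);
        System.out.println("count for max-am-resource-per-user: " + (active + pending));
    }
}
```

With two active users and two pending-only users, the first count is 2 and the second is 4.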

To me this is correct, and it seems to be the same as what I proposed previously:
{code}
We should only consider a user is "active" if any of its application is active. 
And CS will use the "#active-user-which-has-at-least-one-active-app" to compute 
user-limit.

Computation of max-am-resource-per-user needs to be updated as well. We should 
get a #users-which-has-pending-apps to compute max-am-resource-per-user.
{code}

I haven't checked the details of the patch closely, since [~maniraj...@gmail.com] 
is still working on updating the tests, etc. Just one suggestion: AppSchedulingInfo 
is supposed to cache the status of pending resources, so it might be better to avoid 
invoking SchedulerAppAttempt's method from AppSchedulingInfo.

> CapacityScheduler: applications could get starved because computation of 
> #activeUsers considers pending apps 
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4606
>                 URL: https://issues.apache.org/jira/browse/YARN-4606
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler
>    Affects Versions: 2.8.0, 2.7.1
>            Reporter: Karam Singh
>            Assignee: Manikandan R
>            Priority: Critical
>         Attachments: YARN-4606.1.poc.patch, YARN-4606.POC.2.patch, 
> YARN-4606.POC.patch
>
>
> Currently, if all applications belonging to the same user in a LeafQueue are 
> pending (caused by max-am-percent, etc.), ActiveUsersManager still considers 
> that user an active user. This could lead to starvation of active 
> applications, for example:
> - App1 (belongs to user1) / app2 (belongs to user2) are active, app3 (belongs 
> to user3) / app4 (belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, only two users (user1/user2) are able to allocate new resources, 
> so the computed user-limit-resource could be lower than expected.
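To make the starvation in the description concrete, here is a rough arithmetic sketch. It assumes a simplified model in which the queue's resource is split evenly among the counted users; the real CapacityScheduler computation also takes other queue settings into account, and the queue size of 100 units is a made-up number:

```java
// Illustrative arithmetic only; this is a simplified model, not the
// CapacityScheduler user-limit computation.
final class UserLimitSketch {

    /** Even split of a queue's resource among the users counted as active. */
    static long perUserShare(long queueResource, int countedUsers) {
        if (countedUsers <= 0) {
            throw new IllegalArgumentException("countedUsers must be positive");
        }
        return queueResource / countedUsers;
    }

    public static void main(String[] args) {
        long queueResource = 100; // hypothetical units, e.g. GB

        // Counting user3 and user4, whose apps are only pending: 4 users.
        System.out.println("share with 4 counted users: " + perUserShare(queueResource, 4));

        // Counting only user1 and user2, who can actually allocate: 2 users.
        System.out.println("share with 2 counted users: " + perUserShare(queueResource, 2));
    }
}
```

Under these assumptions each of user1 and user2 is held to 25 units instead of 50, even though user3 and user4 cannot use the remainder.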



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)

