[ https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15108196#comment-15108196 ]
Wangda Tan commented on YARN-4606:
----------------------------------
Proposed solution:
We should consider a user "active" only if at least one of its applications is
active. CS will then use the "#active-user-which-has-at-least-one-active-app" to
compute user-limit.
Computation of max-am-resource-per-user needs to be updated as well. We should
use a #users-which-has-pending-apps count to compute max-am-resource-per-user.
This looks like a major behavior change to existing scheduler logic. Thoughts?
[~vinodkv]/[~jlowe]/[~jianhe].
I'm not sure whether FairScheduler needs similar changes as well: if a user in
FSLeafQueue doesn't have any runnable apps, should we increase #active-users of
QueueMetrics?
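
The counts discussed in the proposed solution can be sketched roughly as follows. This is a simplified Python model, not the actual CapacityScheduler code: the app records, the "active"/"pending" state strings, and all function names are hypothetical, and the "current" count only models the behavior the issue description reports (every user with an app in the queue is counted).

```python
# Hypothetical app records: (app_id, user, state), where state is
# "active" or "pending". These mirror the example in the issue description.
APPS = [
    ("app1", "user1", "active"),
    ("app2", "user2", "active"),
    ("app3", "user3", "pending"),
    ("app4", "user4", "pending"),
]


def users_with_state(apps, state):
    """Return the set of users that have at least one app in the given state."""
    return {user for _, user, app_state in apps if app_state == state}


def count_active_users_current(apps):
    """Model of the reported behavior: any user with an app counts as active."""
    return len({user for _, user, _ in apps})


def count_active_users_proposed(apps):
    """Proposed: count only users that have at least one active app."""
    return len(users_with_state(apps, "active"))


def count_users_with_pending_apps(apps):
    """Proposed input for max-am-resource-per-user: users with pending apps."""
    return len(users_with_state(apps, "pending"))
```

With the four apps above, the modeled current count includes user3 and user4, while the proposed count covers only user1 and user2.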
> CapacityScheduler: applications could get starved because computation of
> #activeUsers considers pending apps
> -------------------------------------------------------------------------------------------------------------
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler, capacityscheduler
> Affects Versions: 2.8.0, 2.7.1
> Reporter: Karam Singh
> Assignee: Wangda Tan
> Priority: Critical
>
> Currently, if all applications belonging to the same user in a LeafQueue are
> pending (caused by max-am-percent, etc.), ActiveUsersManager still considers
> the user to be an active user. This could lead to starvation of active
> applications, for example:
> - App1 (belongs to user1) and app2 (belongs to user2) are active; app3 (belongs
> to user3) and app4 (belongs to user4) are pending.
> - ActiveUsersManager returns #active-users=4.
> - However, only two users (user1/user2) are able to allocate new
> resources, so the computed user-limit-resource could be lower than expected.
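
To make the arithmetic in the example above concrete, here is a rough Python sketch. It assumes a simplified user-limit model (an equal share per active user, floored by a minimum percentage of the queue); the function name, the queue size of 100 units, and the 10% minimum are all hypothetical, and the real LeafQueue computation has more inputs.

```python
def user_limit(queue_resource, num_active_users, min_user_limit_percent):
    """Simplified user-limit model (an assumption, not the real LeafQueue code).

    Each active user gets an equal share of the queue, but never less than
    the configured minimum percentage of the queue.
    """
    equal_share = queue_resource / max(num_active_users, 1)
    guaranteed = queue_resource * min_user_limit_percent / 100
    return max(equal_share, guaranteed)


# Hypothetical queue of 100 resource units with a 10% minimum user limit.
limit_counting_pending_users = user_limit(100, 4, 10)  # user1..user4 counted
limit_counting_active_only = user_limit(100, 2, 10)    # only user1/user2 counted
```

Under these assumptions, counting all four users gives each of user1 and user2 a limit of 25 units, whereas counting only the two users that can actually allocate gives 50 units each.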