[
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16425736#comment-16425736
]
Eric Payne commented on YARN-4606:
----------------------------------
bq. could you briefly summary what is the current issue and solution being
discussed?
[~leftnoteasy], the latest patch ({{YARN-4606.POC.patch}}) changed the behavior
of the capacity scheduler so that it never gives the second app a container for
its AM as long as the first app consumes the entire queue and has pending
requests, even when the queue's used AM resources are below the queue's max AM
resources. I described it in more detail
[above|https://issues.apache.org/jira/browse/YARN-4606?focusedCommentId=16391802&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16391802].
[I suggested that one
solution|https://issues.apache.org/jira/browse/YARN-4606?focusedCommentId=16396094&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-16396094]
would be to modify the code as follows, provided there is a way to do it
abstractly:
{code:title=AppSchedulingInfo#updatePendingResources}
if (Not Waiting For AM Container
    || (Queue Used AM Resources < Queue Max AM Resources)) {
  abstractUsersManager.activateApplication(user, applicationId);
}
{code}
I suggested a way to do that, but it seems a little cumbersome.
So then I started wondering if there was a way to leverage the {{Schedulable
Apps}} and {{Non-Schedulable Apps}} user info in the
{{AppSchedulingInfo#updatePendingResources}} code. I looked more closely,
however, and it is too early within
{{AppSchedulingInfo#updatePendingResources}} to tell whether or not a new app
is destined to be schedulable.
So, I think the best suggestion I have is the pseudo-code I posted above.
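For illustration, the condition in the pseudo-code can be written as a standalone predicate. This is only a sketch: {{ActivationCheck}}, {{shouldActivate}}, and the plain {{long}} memory values are hypothetical stand-ins, not existing YARN classes and not how {{AppSchedulingInfo}} actually tracks resources.

```java
// Hypothetical sketch of the proposed activation condition.
// None of these names exist in YARN; resources are modeled as plain MB values.
class ActivationCheck {

    // Returns true when the app's user should be treated as active:
    // either the app already has its AM container, or the queue still
    // has AM headroom so the AM container can actually be granted.
    static boolean shouldActivate(boolean waitingForAmContainer,
                                  long queueUsedAmMb,
                                  long queueMaxAmMb) {
        return !waitingForAmContainer || queueUsedAmMb < queueMaxAmMb;
    }

    public static void main(String[] args) {
        // App already running its AM: always activated.
        System.out.println(shouldActivate(false, 4096, 4096));
        // App waiting for its AM, queue below AM max: activated.
        System.out.println(shouldActivate(true, 2048, 4096));
        // App waiting for its AM, queue at AM max: not activated.
        System.out.println(shouldActivate(true, 4096, 4096));
    }
}
```

The second case is the one that matters here: the second app is still waiting for its AM, but because used AM resources are below AM max, it would be activated and could receive its AM container.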
> CapacityScheduler: applications could get starved because computation of
> #activeUsers considers pending apps
> -------------------------------------------------------------------------------------------------------------
>
> Key: YARN-4606
> URL: https://issues.apache.org/jira/browse/YARN-4606
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler, capacityscheduler
> Affects Versions: 2.8.0, 2.7.1
> Reporter: Karam Singh
> Assignee: Wangda Tan
> Priority: Critical
> Attachments: YARN-4606.1.poc.patch, YARN-4606.POC.patch
>
>
> Currently, if all applications belonging to the same user in a LeafQueue are
> pending (caused by max-am-percent, etc.), ActiveUsersManager still considers
> that user to be an active user. This could lead to starvation of active
> applications, for example:
> - App1 (belongs to user1) and app2 (belongs to user2) are active; app3 (belongs
> to user3) and app4 (belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, only two users (user1/user2) are able to allocate new
> resources, so the computed user-limit-resource could be lower than expected.
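The effect of the inflated user count can be seen with a small worked example. This is a deliberately simplified model, not the actual LeafQueue computation: it assumes the user limit is the larger of an even split across counted active users and a minimum user-limit percent. The class name, method name, queue size, and percent are all hypothetical.

```java
// Simplified, hypothetical model of a per-user limit in a queue.
// It is not the real CapacityScheduler formula; it only shows how
// counting pending-only users shrinks each user's share.
class UserLimitExample {

    static long userLimitMb(long queueMb, int activeUsers, int minUserLimitPercent) {
        // Even split across the users counted as active (guard against zero).
        long evenShare = queueMb / Math.max(activeUsers, 1);
        // Floor given by the configured minimum user-limit percent.
        long minShare = queueMb * minUserLimitPercent / 100;
        return Math.max(evenShare, minShare);
    }

    public static void main(String[] args) {
        long queueMb = 102400; // assumed 100 GB queue
        int minPercent = 10;   // assumed minimum user-limit percent

        // Pending-only users (user3/user4) counted as active.
        System.out.println(userLimitMb(queueMb, 4, minPercent));
        // Only users that can actually allocate (user1/user2) counted.
        System.out.println(userLimitMb(queueMb, 2, minPercent));
    }
}
```

Under these assumptions, counting four users gives each user 25600 MB, while counting only the two users that can allocate gives each 51200 MB. With four counted, user1 and user2 together could use only half of the queue.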
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)