[ 
https://issues.apache.org/jira/browse/YARN-4606?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16345536#comment-16345536
 ] 

Manikandan R commented on YARN-4606:
------------------------------------

[~leftnoteasy] Based on my understanding, as of now 
{{ActiveUsersManager#getNumActiveUsers}} is used only in 
{{LeafQueue#getUserAMResourceLimitPerPartition}} to compute the userAMLimit. 
Given this, would it be sufficient for this JIRA to ensure that the active 
users count increments only after app activation? Something like:

 1. Introduce a new method {{activeApplication()}} in {{AppSchedulingInfo}} 
that sets a new private boolean variable "isActivated" to true.
 2. Call the method from #1 in {{LeafQueue#activateApplications}} after 
{{application.updateAMContainerDiagnostics(AMState.ACTIVATED, null);}}, similar 
to the changes in the earlier POC patch.
 3. Ensure {{abstractUsersManager.activateApplication(user, applicationId);}} 
in {{AppSchedulingInfo#updatePendingResources}} executes only when 
"isActivated" is true.
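
To make the three steps concrete, here is a minimal, self-contained sketch of the gating idea. It is not the actual Hadoop code: {{SketchUsersManager}} and {{SketchAppSchedulingInfo}} are hypothetical stand-ins that model only the bookkeeping discussed above, and the real classes have different constructors and signatures. The method name {{activeApplication()}} follows the naming in step 1.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical stand-in for the users manager: tracks which users have active apps.
class SketchUsersManager {
    private final Map<String, Set<String>> activeAppsByUser = new HashMap<>();

    void activateApplication(String user, String applicationId) {
        activeAppsByUser.computeIfAbsent(user, u -> new HashSet<>()).add(applicationId);
    }

    int getNumActiveUsers() {
        return activeAppsByUser.size();
    }
}

// Hypothetical stand-in for AppSchedulingInfo, reduced to the proposed flag.
class SketchAppSchedulingInfo {
    private final String user;
    private final String applicationId;
    private final SketchUsersManager usersManager;
    private boolean isActivated = false; // step 1: new private flag

    SketchAppSchedulingInfo(String user, String applicationId, SketchUsersManager usersManager) {
        this.user = user;
        this.applicationId = applicationId;
        this.usersManager = usersManager;
    }

    // Steps 1 and 2: the queue would call this once the application is activated.
    void activeApplication() {
        isActivated = true;
    }

    // Step 3: only an activated application marks its user as active.
    void updatePendingResources() {
        if (isActivated) {
            usersManager.activateApplication(user, applicationId);
        }
    }
}

class ActivationGateSketch {
    public static void main(String[] args) {
        SketchUsersManager manager = new SketchUsersManager();
        SketchAppSchedulingInfo app = new SketchAppSchedulingInfo("user1", "app1", manager);

        app.updatePendingResources(); // app still pending: user is not counted
        System.out.println(manager.getNumActiveUsers()); // prints 0

        app.activeApplication();      // queue activates the app
        app.updatePendingResources(); // now the user is counted
        System.out.println(manager.getNumActiveUsers()); // prints 1
    }
}
```

One open point the sketch does not settle: if resource requests arrive before activation, the user is counted only on the next {{updatePendingResources}} call after activation.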

Please share your views.

> CapacityScheduler: applications could get starved because computation of 
> #activeUsers considers pending apps 
> -------------------------------------------------------------------------------------------------------------
>
>                 Key: YARN-4606
>                 URL: https://issues.apache.org/jira/browse/YARN-4606
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler, capacityscheduler
>    Affects Versions: 2.8.0, 2.7.1
>            Reporter: Karam Singh
>            Assignee: Wangda Tan
>            Priority: Critical
>         Attachments: YARN-4606.1.poc.patch
>
>
> Currently, if all applications belonging to the same user in a LeafQueue are 
> pending (caused by max-am-percent, etc.), ActiveUsersManager still considers 
> the user to be an active user. This could lead to starvation of active 
> applications, for example:
> - App1 (belongs to user1) and app2 (belongs to user2) are active; app3 
> (belongs to user3) and app4 (belongs to user4) are pending
> - ActiveUsersManager returns #active-users=4
> - However, only two users (user1/user2) are able to allocate new 
> resources, so the computed user-limit-resource could be lower than expected.
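
As a rough illustration of the example in the description, the sketch below divides a queue's resource evenly among the reported active users. This is a deliberate simplification: the real CapacityScheduler computation also involves settings such as minimum-user-limit-percent and user-limit-factor, and the 100 GB queue size and the {{perUserShare}} helper are assumed purely for illustration.

```java
class UserLimitSketch {
    // Hypothetical simplification: an even split of the queue among active users.
    static long perUserShare(long queueResourceMb, int activeUsers) {
        if (activeUsers <= 0) {
            throw new IllegalArgumentException("activeUsers must be positive");
        }
        return queueResourceMb / activeUsers;
    }

    public static void main(String[] args) {
        long queueResourceMb = 100L * 1024; // assumed queue size: 100 GB

        // Pending users (user3/user4) are counted, as described in the issue.
        System.out.println(perUserShare(queueResourceMb, 4)); // prints 25600

        // Only users with activated apps (user1/user2) are counted.
        System.out.println(perUserShare(queueResourceMb, 2)); // prints 51200
    }
}
```

Under this simplified model, counting the two pending users halves the share available to each user who can actually allocate resources.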



