[ 
https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15847719#comment-15847719
 ] 

Wangda Tan commented on YARN-5889:
----------------------------------

Major comments of UsersManager:
1) Remove UM#incResourceUsagePerUser, instead LQ should use 
updateUserResourceUsageDuringAllocate to make queue ratio get updated as well.

2) Instead of storing user limit for every user/partition/scheduling-mode, we 
should only store it for every partition and scheduling-mode since now we have 
an identical UL for all users in active set and total set. 

3) Instead of storing local version inside user, we should store it for every 
partition/scheduling-mode to make sure we can always get up-to-date UL. And 
also, I think we may need two maps to store active-user/all-users local version.

4) Headroom of user can be fixed in a separate patch with lower priority: now 
headroom is set to queue-capacity * MULP, and it will slowly increase with more 
container allocation.

Minor comments for UsersManager
5) public to private: 
updateQueueUsageRatio/getUsers/getLatestVersionOfUsersState

6) Merge implementation of 
updateUserResourceUsageDuringAllocate/updateUserResourceUsageDuringRelease.

7) resourceCalculator can be stored in UsersManager, which we can avoid passing 
RC in all parameters.

8) getTotalResourceUsagePerUser, instead of using nested {{? .. : ..}}, suggest 
to use if..else

9) UM#UsageRatios: not caused by your patch: there're some necessary boxing 
operation. Better to update inc/setUsageRatio, all new Float is not required.

10) Following method is too simple to be a separated method: 
removeFromUsersSet/removeFromUsersSet

11) Need writeLock: UM#removeUser, 

12) isRecomputeNeeded: if you agree with #3, I think the parameter should be: 
partition/schedulingMode only. Map<SchedulingMode, Resource> is not required 
since if partition=x and scheduling-mode=y exists, the value of resource should 
not null, correct?

13) getComputedResourceLimitForAll/ActiveUsers, I think the logic can be 
simplified to:
{code}
writelock {
  if (isRecomputationNeede(...)) {
      // ...
  }
}
return map.get(scheduling_mode, partition); 
{code}

14) {{userLimitPerSchedulingMode.clear()}} can be removed from 
{{reComputeUserLimits}} if you agree with #12.

15) {{Resource resourceByLabel}} can be removed from parameter list of 
{{updateUserResourceUsageDuringAllocate}}

16) calculateUserUsageRatio could be removed and all logics can be done inside 
setUsageRatio which wrapped by writelock. And rename {{setUsageRatio}} to 
{{updateUsageRatio}}

17) rename updateQueueUsageRatio to incQueueUsageRatio

18) (TODO) Please review all usage of getUserAndAddIfAbsent, I think some of 
them might not necessary.

> Improve user-limit calculation in capacity scheduler
> ----------------------------------------------------
>
>                 Key: YARN-5889
>                 URL: https://issues.apache.org/jira/browse/YARN-5889
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: YARN-5889.0001.patch, 
> YARN-5889.0001.suggested.patchnotes, YARN-5889.0002.patch, 
> YARN-5889.0003.patch, YARN-5889.0004.patch, YARN-5889.0005.patch, 
> YARN-5889.0006.patch, YARN-5889.0007.patch, YARN-5889.v0.patch, 
> YARN-5889.v1.patch, YARN-5889.v2.patch
>
>
> Currently user-limit is computed during every heartbeat allocation cycle with 
> a write lock. To improve performance, this tickets is focussing on moving 
> user-limit calculation out of heartbeat allocation flow.



--
This message was sent by Atlassian JIRA
(v6.3.15#6346)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to