Jason Lowe commented on YARN-5889:

bq. To solve the problem, we need to compute user limit considering non-active 
users. If a non-active user uses less than minimum-user-limit, we can continue 
distribute its available quotas to other active users; in the other hand, if a 
non-active user uses more than minimum-user-limit, we could also get resource 
from the user.

As I understand it, for allocation purposes we need to compute the user limit 
where the number of users is the number of users who have at least one 
application that is requesting resources.  For preemption purposes, we need to 
compute the user limit where the number of users is the total number of users 
who have at least one app in the queue (whether they have apps that are 
requesting or not).

bq. This computation is more expensive, it should be O(N), N is number of 
applications in the queue.

I don't see how this is related to the number of applications.  If you look at 
how the user limit is calculated, there are no terms in that calculation that 
have anything to do with how many apps the user has.  Am I missing something?  
If so, maybe an example where number of applications or order of applications 
would help clarify.

I still think a simple flag to indicate a user limit needs to be recomputed 
would go a long way here.  We are already tracking total resources associated 
with each user in each queue in the User structure and caching the user limit 
there as well.  We could add a flag to indicate the cached user limit needs to 
be recalculated, and when we go to get the cached value it can on-the-fly 
recalculate it if it is dirty.  The calculation would only become dirty if one 
of the following events occur:
- An allocation is assigned to one of the user's apps
- A container associated with one of the user's apps completes
- A user becomes active in the queue (i.e.: now has apps requesting resources)
- A user becomes inactive in the queue (i.e.: no longer has apps requesting 
- The queue capacity changes (e.g.: nodes added/removed)
- The queue settings are refreshed

We can speed up the last four by having a queue-level sequence number that is 
incremented when active users or queue changes.  The User structure can cache 
the sequence number used when the user limit is recalculated and compare with 
the queue sequence number to know a user's limit needs recalculating.  This can 
also become the user-level flag when containers are allocated/completed by 
setting the cached sequence number in the User structure to the queue's 
sequence number minus one, which will force the sequence numbers to mismatch 
and cause a recalculation when the user limit is requested.

The idea here is that we will only calculate the user limit lazily and only 
when absolutely necessary.  This will be much faster than what we do today and 
not require asynchronous computation that breaks the constraints of the 
scheduler.  For purposes of doing preemptions, we can have the same concept 
used for a separate cached user limit, one that considers all users instead of 
only active ones.

When it comes time to calculate preemptions (which are only necessary when the 
queue cannot get more resources), we can sort the users by how far they are 
beyond their all-users user limit (either in absolute terms or by percentage).  
This may or may not require a computation depending upon whether the cached 
value is out of date.  Then we can walk down the list starting with the user 
most past their limit.  We can stop traversing when we either preempt enough 
resources or we get to a user that is below their limit.

> Improve user-limit calculation in capacity scheduler
> ----------------------------------------------------
>                 Key: YARN-5889
>                 URL: https://issues.apache.org/jira/browse/YARN-5889
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: YARN-5889.v0.patch, YARN-5889.v1.patch, 
> YARN-5889.v2.patch
> Currently user-limit is computed during every heartbeat allocation cycle with 
> a write lock. To improve performance, this tickets is focussing on moving 
> user-limit calculation out of heartbeat allocation flow.

This message was sent by Atlassian JIRA

To unsubscribe, e-mail: yarn-issues-unsubscr...@hadoop.apache.org
For additional commands, e-mail: yarn-issues-h...@hadoop.apache.org

Reply via email to