[ 
https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15714764#comment-15714764
 ] 

Sunil G commented on YARN-5889:
-------------------------------

Yes [~eepayne], I understand your view here.

However, in practice we would still need to compute the user limit in the 
allocation thread whenever allocations happened in the prior heartbeat or 
containers were released between two heartbeats. In a busy cluster this means 
we would be doing much the same work as before, with only minor improvements 
(I agree that on normal clusters we would see some improvement). When we 
tested with SLS, the user-limit computation was done under the write lock and 
consumed a good amount of time.

If we take the user-limit computation out of the allocation thread, we gain 
some clear advantages:
- Allocation is no longer blocked on computing the user limit.
- Other modules such as preemption (user-limit/priority etc.) get a read-only 
user limit.
- A dedicated thread run from a user manager will be easier to maintain.
- This is still configuration driven, so users can weigh the minor limitations 
and opt in to get more performance.
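
To make the first two points concrete, here is a minimal sketch of the idea. 
It is not taken from any of the attached patches: the class name, method 
names, and the LongSupplier stand-in for the real computation are all 
hypothetical. The allocation path only sets a flag and reads a published 
value, while the expensive computation runs elsewhere.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.function.LongSupplier;

// Hypothetical sketch, not the YARN-5889 patch: all names are illustrative.
final class AsyncUserLimitSketch {
    // Last published limit; volatile so readers never need a lock.
    private volatile long publishedLimit;
    // Set when an allocation or release invalidates the published limit.
    private final AtomicBoolean dirty = new AtomicBoolean(true);
    // Stand-in for the expensive computation now done under the write lock.
    private final LongSupplier computation;

    AsyncUserLimitSketch(LongSupplier computation) {
        this.computation = computation;
    }

    // Called from the allocation path: cheap and non-blocking.
    void markDirty() {
        dirty.set(true);
    }

    // Called by allocation, preemption, etc.: a read-only view of the limit.
    long getUserLimit() {
        return publishedLimit;
    }

    // One iteration of the background thread's loop.
    // Returns true if a new limit was computed and published.
    boolean recomputeIfDirty() {
        if (!dirty.compareAndSet(true, false)) {
            return false; // nothing changed since the last publish
        }
        publishedLimit = computation.getAsLong();
        return true;
    }
}
```

A real implementation would call recomputeIfDirty() in a loop on its own 
thread; it is exposed as a plain method here only so the behaviour can be 
exercised deterministically.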

As I see it now, there is only one case in which the scheduler may get an 
older limit:
- A container release/allocation happens.
- CS sets a push-to-recompute-user-limit flag for ComputeUserLimitAsyncThread 
or Manager.
- ComputeUserLimitAsyncThread is in the middle of computing the limit and has 
yet to publish it.
- At the same time, another allocation thread uses the old data to do one 
allocation.
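
The four steps above can be replayed deterministically in a few lines. This 
is only an illustration of the ordering, with hypothetical names and made-up 
numbers. It runs on a single thread, so the "concurrent" read is simply placed 
between compute and publish.

```java
import java.util.concurrent.atomic.AtomicBoolean;

// Hypothetical replay of the stale-read window; not code from the patch.
final class StaleLimitReplay {
    private volatile long publishedLimit;
    private final AtomicBoolean recomputeRequested = new AtomicBoolean(false);

    StaleLimitReplay(long initialLimit) {
        this.publishedLimit = initialLimit;
    }

    // Returns {limit seen by the concurrent allocation, limit after publish}.
    long[] replay(long freshLimit) {
        // Steps 1-2: a container is released or allocated, and the scheduler
        // flags that the user limit must be recomputed.
        recomputeRequested.set(true);

        // Step 3: the async thread picks up the request and computes the new
        // limit, but has not published it yet.
        long pending = recomputeRequested.getAndSet(false)
                ? freshLimit
                : publishedLimit;

        // Step 4: another allocation reads the limit now and sees the old one.
        long seenByAllocator = publishedLimit;

        // The async thread publishes; later allocations see the fresh value.
        publishedLimit = pending;
        return new long[] {seenByAllocator, publishedLimit};
    }
}
```

In this replay at most one allocation decision is made against the old value 
before the new limit becomes visible.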

I will now run some SLS tests with and without the separate computation 
thread, and with the suggested improvements, so that we can compare the 
performance gains of both approaches.

> Improve user-limit calculation in capacity scheduler
> ----------------------------------------------------
>
>                 Key: YARN-5889
>                 URL: https://issues.apache.org/jira/browse/YARN-5889
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacity scheduler
>            Reporter: Sunil G
>            Assignee: Sunil G
>         Attachments: YARN-5889.v0.patch, YARN-5889.v1.patch, 
> YARN-5889.v2.patch
>
>
> Currently user-limit is computed during every heartbeat allocation cycle 
> under a write lock. To improve performance, this ticket focuses on moving 
> the user-limit calculation out of the heartbeat allocation flow.


