[
https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15714764#comment-15714764
]
Sunil G commented on YARN-5889:
-------------------------------
Yes [~eepayne], I understand your view here.
However, even in the ideal case we might still need to compute the user limit
in the allocation thread if allocations happened in the prior heartbeat, or if
a container was released between two heartbeats. This means that in a busy
cluster we would be doing much the same work as before, with only minor
improvements (I agree that on normal clusters we would see some improvement).
When we tested with SLS, the user-limit computation was done under the write
lock and consumed a significant amount of time.
If we take the user-limit computation out of the allocation thread, we get some
good advantages:
- Allocation is no longer blocked on computing the user limit.
- Other modules such as preemption (user-limit/priority etc) get a read-only
user limit.
- A dedicated thread run from a user manager will be easier to maintain.
- This is still configuration driven, so users can weigh the minor limitations
and opt in for better performance.
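To make the idea concrete, here is a minimal sketch of such a holder. It is not
taken from the attached patches: the class name AsyncUserLimit, its methods, and
the placeholder computation passed in are all assumptions for illustration.
Readers get the last published limit without taking a lock, and the background
pass recomputes only when the scheduler has flagged a change.

```java
import java.util.concurrent.atomic.AtomicBoolean;
import java.util.concurrent.atomic.AtomicLong;
import java.util.function.LongSupplier;

/** Hypothetical holder for an asynchronously computed user limit. */
class AsyncUserLimit {
    private final AtomicLong published;                            // last published limit
    private final AtomicBoolean dirty = new AtomicBoolean(false);  // "recompute" flag
    private final LongSupplier computation;                        // the expensive computation

    AsyncUserLimit(LongSupplier computation) {
        this.computation = computation;
        this.published = new AtomicLong(computation.getAsLong());
    }

    /** Called by the scheduler after a container allocation or release. */
    void markDirty() {
        dirty.set(true);
    }

    /** Called by allocation, preemption, etc. Never blocks; may be slightly stale. */
    long get() {
        return published.get();
    }

    /** One pass of the background thread: recompute only if something changed. */
    boolean recomputeIfDirty() {
        if (!dirty.compareAndSet(true, false)) {
            return false;
        }
        published.set(computation.getAsLong());
        return true;
    }
}
```

A background thread would call recomputeIfDirty() in a loop, while the
allocation path only ever calls markDirty() and get().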
As I see it now, there is only one case in which the scheduler may get an older
limit:
- A container release/allocation happens.
- CS sets a push-to-recompute-user-limit flag for ComputeUserLimitAsyncThread
or the manager.
- ComputeUserLimitAsyncThread is still computing the limit and has yet to
publish it.
- At the same time, another allocation thread uses the old data to do one
allocation.
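The window described above can be reproduced deterministically with two
latches. This is only a sketch: StaleLimitDemo and the values 100 and 70 are
made up for illustration, and the latch wait stands in for the time the real
computation would take.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

/** Hypothetical demo of one allocation reading a limit that is being recomputed. */
class StaleLimitDemo {
    /** Returns {limit seen while the recompute is in flight, limit seen after publish}. */
    static long[] run() {
        AtomicLong published = new AtomicLong(100);          // limit before the change
        CountDownLatch computing = new CountDownLatch(1);    // recompute has started
        CountDownLatch mayPublish = new CountDownLatch(1);   // recompute may finish

        Thread recompute = new Thread(() -> {
            long fresh = 70;                  // new limit, computed but not yet published
            computing.countDown();
            try {
                mayPublish.await();           // stands in for the remaining compute time
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            published.set(fresh);             // publish
        });
        recompute.start();

        try {
            computing.await();                        // recompute is now in progress
            long seenDuring = published.get();        // allocation reads the old limit
            mayPublish.countDown();
            recompute.join();
            return new long[] {seenDuring, published.get()};
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }
}
```

The first value returned is the old limit and the second is the new one, so at
most the allocations made inside that window act on old data.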
I will now run some SLS tests with and without the allocation thread and the
suggested improvements, so that we can compare the performance of both
approaches.
> Improve user-limit calculation in capacity scheduler
> ----------------------------------------------------
>
> Key: YARN-5889
> URL: https://issues.apache.org/jira/browse/YARN-5889
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Reporter: Sunil G
> Assignee: Sunil G
> Attachments: YARN-5889.v0.patch, YARN-5889.v1.patch,
> YARN-5889.v2.patch
>
>
> Currently user-limit is computed during every heartbeat allocation cycle with
> a write lock. To improve performance, this ticket focuses on moving the
> user-limit calculation out of the heartbeat allocation flow.