[
https://issues.apache.org/jira/browse/YARN-5889?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15730055#comment-15730055
]
Wangda Tan commented on YARN-5889:
----------------------------------
Thanks [~sunilg] for working on the patch, and thanks [~jlowe] and [~eepayne]
for the suggestions.
Personally, I think the title and description are a little confusing.
First of all, I think the most important goal of this JIRA is not improving
performance; it is making user-limit preemption correct. Currently we compute
a single user-limit value for each leaf queue. This is enough for allocation
but not enough for preemption. Here is an example.
A queue has cap=max-cap=100, min-user-limit-percent=50, user-limit-factor=1. At
time T, there are 2 users using resources:
{code}
u1.used = 75, u2.used = 25
{code}
Only u2 is an active user.
According to the existing user-limit computation:
{code}
user_limit =
  round_up(
    min(
      max(current_capacity / #active_user,
          current_capacity * user_limit_percent),
      queue_capacity * user_limit_factor),
    minimum_allocation)
{code}
The computed user-limit is 100, which is more than any user's usage, so nothing
will be preempted.
We can give many other examples, such as:
{code}
minimum-user-limit-percent = 33
3 users:
u1.used = 50, u2.used = 20, u3.used = 30
u2/u3 are active users
{code}
The computed user-limit is 50, and no user is above it, so preemption cannot
kick in.
This problem can happen whenever #active-user < #total-user. At the allocation
stage we only need to check active users, but in preemption we also need to
preempt resources from non-active users.
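To make the arithmetic in the two examples above easy to check, here is a
minimal Python sketch of the formula as written in this comment. It is not the
CapacityScheduler code: the function and parameter names are hypothetical,
current_capacity is assumed to equal the queue capacity (100), and
minimum_allocation is assumed to be 1.

```python
import math

def user_limit(current_capacity, queue_capacity, active_users,
               user_limit_percent, user_limit_factor, minimum_allocation=1):
    """Single per-queue user limit, following the formula quoted above."""
    # Equal share among active users only.
    share = current_capacity / active_users
    # Guaranteed minimum share given by min-user-limit-percent.
    guaranteed = current_capacity * user_limit_percent / 100
    # Cap by queue_capacity * user_limit_factor.
    capped = min(max(share, guaranteed), queue_capacity * user_limit_factor)
    # Round up to a multiple of the minimum allocation.
    return math.ceil(capped / minimum_allocation) * minimum_allocation

def users_over_limit(used, limit):
    """Users whose usage is strictly above the limit (preemption candidates)."""
    return {user: amount - limit for user, amount in used.items() if amount > limit}

# Example 1: u1.used = 75, u2.used = 25, only u2 active, percent = 50.
limit_1 = user_limit(100, 100, active_users=1,
                     user_limit_percent=50, user_limit_factor=1)
candidates_1 = users_over_limit({"u1": 75, "u2": 25}, limit_1)

# Example 2: u1 = 50, u2 = 20, u3 = 30, u2/u3 active, percent = 33.
limit_2 = user_limit(100, 100, active_users=2,
                     user_limit_percent=33, user_limit_factor=1)
candidates_2 = users_over_limit({"u1": 50, "u2": 20, "u3": 30}, limit_2)
```

In both cases the candidate set is empty, because the limit is derived from the
number of active users only and ends up at or above every user's usage.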
To solve the problem, we need to compute the user limit while also considering
non-active users. If a non-active user uses less than minimum-user-limit, we
can distribute its unused quota to the active users; on the other hand, if a
non-active user uses more than minimum-user-limit, we can also take resources
back from that user. This computation is more expensive: it should be O(N),
where N is the number of applications in the queue.
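One possible reading of this idea, sketched in Python under loud assumptions:
this is not the patch and not an agreed algorithm. The helper name, the
per-user (rather than per-application) aggregation, and the treatment of
minimum-user-limit as capacity * min-user-limit-percent are all hypothetical.

```python
def split_non_active(used, active, capacity, min_user_limit_percent):
    """Classify non-active users against the minimum user limit.

    Returns (spare, preemptable):
      spare       - unused quota of non-active users below the minimum limit,
                    which could be handed to active users
      preemptable - per-user usage above the minimum limit, which could be
                    taken back from non-active users
    """
    min_limit = capacity * min_user_limit_percent / 100  # assumed definition
    spare = 0
    preemptable = {}
    # One pass over the users; a real scheduler would first aggregate usage
    # over all applications in the queue, which is the O(N) part.
    for user, amount in used.items():
        if user in active:
            continue
        if amount < min_limit:
            spare += min_limit - amount
        elif amount > min_limit:
            preemptable[user] = amount - min_limit
    return spare, preemptable

# First example from this comment: u1 is non-active and uses 75.
spare_1, preemptable_1 = split_non_active(
    {"u1": 75, "u2": 25}, active={"u2"}, capacity=100, min_user_limit_percent=50)
```

Under these assumptions the first example yields 25 units that could be
preempted from u1, whereas the single per-queue limit of 100 yields nothing.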
That is why we need an async thread to do all of this: we cannot put an O(N)
computation in the allocation thread. To me, the things that the computation
of the (actual) user-limit and of fair share (FS) have in common are:
- Both are too expensive to do when checking every application.
- Both are instantaneous limits, and no user should need to understand the
computed instantaneous limit. The instantaneous limit and usage may keep
changing, but they will converge to a balance over a period of time.
I haven't checked the patch implementation yet. Please let us know your
thoughts on these overall points. I also don't want this change to block the
user-limit preemption effort, so it would be helpful if you could share ideas
about how we can achieve user-limit preemption without the async thread
approach.
Thanks,
> Improve user-limit calculation in capacity scheduler
> ----------------------------------------------------
>
> Key: YARN-5889
> URL: https://issues.apache.org/jira/browse/YARN-5889
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler
> Reporter: Sunil G
> Assignee: Sunil G
> Attachments: YARN-5889.v0.patch, YARN-5889.v1.patch,
> YARN-5889.v2.patch
>
>
> Currently user-limit is computed during every heartbeat allocation cycle with
> a write lock. To improve performance, this ticket focuses on moving the
> user-limit calculation out of the heartbeat allocation flow.