[
https://issues.apache.org/jira/browse/YARN-5864?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15655514#comment-15655514
]
Tan, Wangda commented on YARN-5864:
-----------------------------------
Thanks [~curino] for sharing these insightful suggestions.
The problem you mentioned is real: we have put a lot of effort into adding
features for various resource constraints (such as limits, node partitions,
priority, etc.), but we have paid less attention to making the semantics
simpler and more consistent.
I also agree that we need to spend some time thinking about what semantics the
YARN scheduler should have. For example, the minimum guarantee of CS is that
each queue should get at least its configured capacity, but a picky app could
leave an under-utilized queue waiting forever for resources. And, as you
mentioned above, a non-preemptable queue can invalidate configured capacity as
well.
However, I would argue that the scheduler cannot run perfectly without ever
violating any of the constraints. It is not just a set of formulas that we
define and hand to a solver to optimize; it also involves human emotions and
preferences. For example, a user may not understand, or be glad to accept, why
a picky request cannot be allocated even though the queue/cluster has available
capacity. And it may not be acceptable for a production cluster that a
long-running service for realtime queries cannot be launched because we don't
want to kill some less important batch jobs. My point is: if we have these
rules defined in the docs, and users can see what happened from the UI/logs,
we can add them.
To improve this, I think your suggestion (1) will be more helpful and
achievable in the short term. We can definitely remove some parameters; for
example, the existing user-limit definition is not good enough, and
user-limit-factor can always prevent a queue from fully utilizing its
capacity. We can also better define these semantics in the docs and UI.
(2) looks beautiful, but it may not solve the root problem directly: the first
priority is to make our users happy to accept the behavior, rather than to
solve it beautifully in mathematics. For example, for the problem I put in the
description of this JIRA, I don't think (2) can get the allocation without
harming other applications. And from an implementation perspective, I'm not
sure how a solver-based solution can handle both fast allocation (we want to do
allocation within milliseconds for interactive queries) and good placement
(such as gang scheduling with other constraints like anti-affinity). It seems
to me that with option (2) we would sacrifice low latency to get better
placement quality.
bq. This opens up many abuses, one that comes to mind ...
Actually, this feature will only be used in a fairly controlled environment:
important long-running services run in a separate queue, and the admin/user
agrees that it can preempt other batch jobs to get new containers. ACLs will be
set to keep normal users from running inside these queues; all apps running in
the queue should be trusted apps such as YARN native services (Slider), Spark,
etc. We can also make sure these apps try their best to respect other apps.
Please advise if you think we can improve the semantics of this feature.
Thanks,
> Capacity Scheduler preemption for fragmented cluster
> -----------------------------------------------------
>
> Key: YARN-5864
> URL: https://issues.apache.org/jira/browse/YARN-5864
> Project: Hadoop YARN
> Issue Type: New Feature
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Attachments: YARN-5864.poc-0.patch
>
>
> YARN-4390 added preemption for reserved containers. However, we found one
> case where a large container cannot be allocated even if all queues are under
> their limits.
> For example, we have:
> {code}
> Two queues, a and b, capacity 50:50
> Two nodes: n1 and n2, each of them have 50 resource
> Now queue-a uses 10 on n1 and 10 on n2
> queue-b asks for one single container with resource=45.
> {code}
> The container could be reserved on either host, but no preemption will
> happen because all queues are under their limits.
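
The example in the description can be checked with a small sketch. This is only
an illustration of the arithmetic, not CapacityScheduler code: the Node class,
the queue_usage helper and the variable names are hypothetical, and the
"over capacity" test is a simplified stand-in for the real preemption policy.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical node model: total resource and per-queue usage."""
    name: str
    total: int
    used_by_queue: dict = field(default_factory=dict)

    def free(self) -> int:
        return self.total - sum(self.used_by_queue.values())


def queue_usage(nodes, queue):
    """Total resource a queue currently uses across all nodes."""
    return sum(n.used_by_queue.get(queue, 0) for n in nodes)


# Setup from the description: two nodes of 50, queue-a uses 10 on each.
nodes = [Node("n1", 50, {"a": 10}), Node("n2", 50, {"a": 10})]
cluster_total = sum(n.total for n in nodes)

# Two queues with capacity 50:50 (in percent of the cluster).
guaranteed = {q: cluster_total * 50 // 100 for q in ("a", "b")}

# queue-b asks for one single container with resource=45.
request = 45

# No single node has 45 free, so the container can only be reserved.
fits_on_some_node = any(n.free() >= request for n in nodes)

# Simplified assumption: preemption only targets queues above their guarantee.
over_capacity_queues = [
    q for q in guaranteed if queue_usage(nodes, q) > guaranteed[q]
]

cluster_free = sum(n.free() for n in nodes)

print("fits on some node:", fits_on_some_node)
print("queues over capacity:", over_capacity_queues)
print("free resource in cluster:", cluster_free)
```

Under these assumptions the cluster has enough free resource in total, but it
is split across the two nodes, and no queue is over its guarantee, so nothing
is selected for preemption.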