[
https://issues.apache.org/jira/browse/YARN-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Eric Payne updated YARN-3769:
-----------------------------
Attachment: YARN-3769.004.patch
[~leftnoteasy], Thank you for your review, and sorry for the late reply.
{quote}
- Why is this needed? MAX_PENDING_OVER_CAPACITY. I think this could be
problematic. For example, if a queue has capacity = 50, its usage is 10,
and it has 45 pending resources, then with MAX_PENDING_OVER_CAPACITY=0.1 the
queue cannot preempt resources from other queues.
{quote}
Sorry for the poor naming. The value is not checked against the queue's
capacity; it is a percentage over the queue's currently used resources.
I changed the name to {{MAX_PENDING_OVER_CURRENT}}.
As you know, there are multiple reasons why the preemption monitor could
preempt resources unnecessarily (I call it "flapping"). The lack of
consideration for user limit factor is only one of them. Another is that an
app could be requesting an 8-gig container, and the preemption monitor could
conceivably preempt eight one-gig containers, which would then be rejected by
the requesting AM and potentially given right back to the preempted app.
The {{MAX_PENDING_OVER_CURRENT}} buffer is an attempt to alleviate that
particular flapping situation by providing a buffer zone above the currently
used resources on a particular queue. That is, the preemption monitor
shouldn't treat queue B as asking for pending resources unless the pending
resources on queue B exceed a configured percentage of the currently used
resources on queue B.
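To make the buffer check concrete, here is a minimal sketch of the rule as described above. It is not the code in the patch: the class name, the method name, and the choice to count the full pending amount once the threshold is exceeded are all assumptions made for illustration.

```java
// Hypothetical sketch of the MAX_PENDING_OVER_CURRENT check; names are
// illustrative and do not come from the patch.
final class PendingBufferSketch {

    /**
     * Returns the pending amount the preemption monitor would act on.
     * Pending is ignored (treated as 0) while it stays at or below
     * maxPendingOverCurrent * used. Counting the full pending amount once
     * the threshold is exceeded is an assumption of this sketch.
     */
    static long effectivePending(long used, long pending, double maxPendingOverCurrent) {
        double threshold = used * maxPendingOverCurrent;
        return pending > threshold ? pending : 0L;
    }

    public static void main(String[] args) {
        // Numbers from the review comment: used = 10, pending = 45, buffer = 0.1.
        // The threshold is 10 * 0.1 = 1, so the 45 pending units still count.
        System.out.println(effectivePending(10, 45, 0.1));
        // A small request relative to current usage stays inside the buffer.
        System.out.println(effectivePending(100, 8, 0.1));
    }
}
```

With the numbers from the review comment, the check is made against 10% of the 10 units in use rather than against the capacity of 50, so the queue's 45 pending units are still visible to the preemption monitor.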
If you want, we can pull this out and put it as part of a different JIRA so we
can document and discuss that particular flapping situation separately.
{quote}
- In LeafQueue, it uses getHeadroom() to compute how many resources the user
can use. But I think it may not be correct: ... For the above queue status,
headroom for a.a1 is 0 since queue-a's currentResourceLimit is 0.
So instead of using headroom, I think you can use computed-user-limit -
user.usage(partition) as the headroom. You don't need to consider the queue's
max capacity here, since we will consider the queue's max capacity in the
following logic of PCPP.
{quote}
Yes, you are correct. {{getHeadroom}} could be calculating zero headroom when
we don't want it to. And I agree that we don't need to limit pending resources
to max queue capacity when calculating pending resources. The point of this
fix is that the user limit factor should be taken into account and should cap
the pending value. The max queue capacity will be considered during the offer
stage of the preemption calculations.
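As a rough illustration of the suggested calculation (computed user limit minus the user's usage, with no max-capacity term), a sketch might look like the following. The class and method names are hypothetical, and resources are reduced to a single long value for simplicity; the real scheduler works with multi-dimensional resource objects per partition.

```java
// Hypothetical sketch of capping a user's pending resources by user-limit
// headroom; names and the single-number resource model are assumptions.
final class UserLimitPendingSketch {

    /** Headroom = computed user limit - current usage, never below zero. */
    static long userHeadroom(long computedUserLimit, long userUsed) {
        return Math.max(0L, computedUserLimit - userUsed);
    }

    /**
     * Pending amount reported to the preemption calculation for one user:
     * whatever the user asks for, capped at the user-limit headroom.
     * Queue max capacity is deliberately not applied here.
     */
    static long cappedPending(long userPending, long computedUserLimit, long userUsed) {
        return Math.min(userPending, userHeadroom(computedUserLimit, userUsed));
    }

    public static void main(String[] args) {
        // User limit 30, usage 10, pending 45: only 20 can ever be granted.
        System.out.println(cappedPending(45, 30, 10));
        // A user already at or over the limit contributes no pending resources.
        System.out.println(cappedPending(45, 30, 40));
    }
}
```

Capping at this stage keeps the monitor from preempting containers that the user limit would prevent the requesting user from receiving anyway.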
> Preemption occurring unnecessarily because preemption doesn't consider user
> limit
> ---------------------------------------------------------------------------------
>
> Key: YARN-3769
> URL: https://issues.apache.org/jira/browse/YARN-3769
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacityscheduler
> Affects Versions: 2.6.0, 2.7.0, 2.8.0
> Reporter: Eric Payne
> Assignee: Eric Payne
> Attachments: YARN-3769-branch-2.002.patch,
> YARN-3769-branch-2.7.002.patch, YARN-3769-branch-2.7.003.patch,
> YARN-3769.001.branch-2.7.patch, YARN-3769.001.branch-2.8.patch,
> YARN-3769.003.patch, YARN-3769.004.patch
>
>
> We are seeing the preemption monitor preempting containers from queue A and
> then seeing the capacity scheduler giving them immediately back to queue A.
> This happens quite often and causes a lot of churn.
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)