[ https://issues.apache.org/jira/browse/YARN-3769?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Payne updated YARN-3769:
-----------------------------
    Attachment: YARN-3769.004.patch

[~leftnoteasy], Thank you for your review, and sorry for the late reply.

{quote}
- Why is MAX_PENDING_OVER_CAPACITY needed? I think this could be problematic: 
for example, if a queue has capacity = 50, its usage is 10, and it has 45 
pending resources, then if we set MAX_PENDING_OVER_CAPACITY=0.1, the queue 
cannot preempt resources from other queues.
{quote}
Sorry for the poor naming convention. The value is not actually checked against 
the queue's capacity; it is checked as a percentage over the queue's currently 
used resources. I changed the name to {{MAX_PENDING_OVER_CURRENT}}.

As you know, there are multiple reasons why preemption could unnecessarily 
preempt resources (I call it "flapping"), and the lack of consideration for the 
user limit factor is only one of them. Another is that an app could be 
requesting an 8 GB container, and the preemption monitor could conceivably 
preempt eight 1 GB containers, which would then be rejected by the requesting 
AM and potentially given right back to the preempted app.

The {{MAX_PENDING_OVER_CURRENT}} buffer is an attempt to alleviate that 
particular flapping situation by providing a buffer zone above the resources 
currently used on a particular queue. That is, the preemption monitor should 
not treat queue B as asking for pending resources unless the pending resources 
on queue B exceed a configured percentage of the resources currently used on 
queue B (see the sketch below).
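
To make the intent concrete, here is a rough sketch of that check. This is not 
the code from the attached patch; the method and parameter names are 
hypothetical, and plain MB values stand in for the YARN {{Resource}} types.

{code:java}
/**
 * Illustrative sketch only (not from YARN-3769.004.patch): decide whether a
 * queue's pending resources are large enough, relative to its current usage,
 * for the preemption monitor to act on them. All names are hypothetical.
 */
public class PendingOverCurrentCheck {

  /**
   * @param pendingMB              resources the queue is asking for, in MB
   * @param usedMB                 resources the queue currently uses, in MB
   * @param maxPendingOverCurrent  configured buffer, e.g. 0.1 means pending
   *                               must exceed 10% of current usage to count
   * @return true if the pending demand should drive preemption
   */
  static boolean pendingWorthPreemptingFor(long pendingMB, long usedMB,
      double maxPendingOverCurrent) {
    // Pending demand only counts once it rises above the buffer zone that
    // sits on top of the queue's current usage.
    return pendingMB > (long) (usedMB * maxPendingOverCurrent);
  }

  public static void main(String[] args) {
    // Queue B uses 10 GB and asks for 0.5 GB more; with a 10% buffer the
    // monitor ignores this demand (512 <= 1024) and avoids flapping.
    System.out.println(pendingWorthPreemptingFor(512, 10240, 0.1));   // false
    System.out.println(pendingWorthPreemptingFor(2048, 10240, 0.1));  // true
  }
}
{code}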

If you want, we can pull this out and put it as part of a different JIRA so we 
can document and discuss that particular flapping situation separately.

{quote}
- In LeafQueue, it uses getHeadroom() to compute how many resources the user 
can use. But I think it may not be correct: ... For the above queue status, 
headroom for a.a1 is 0 since queue-a's currentResourceLimit is 0.
So instead of using headroom, I think you can use computed-user-limit - 
user.usage(partition) as the headroom. You don't need to consider the queue's 
max capacity here, since we will consider the queue's max capacity in the 
following logic of PCPP.
{quote}
Yes, you are correct. {{getHeadroom}} could be calculating zero headroom when 
we don't want it to. And I agree that we don't need to limit pending resources 
to the max queue capacity when calculating pending resources. The concern for 
this fix is that the user limit factor should be considered and should bound 
the pending value; the max queue capacity will be considered during the offer 
stage of the preemption calculations.
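
Roughly, the per-user bound being discussed looks like the sketch below. The 
names and the use of plain MB values are illustrative only; the actual patch 
works on the YARN {{Resource}} objects inside {{LeafQueue}}.

{code:java}
/**
 * Illustrative sketch only (hypothetical names): bound a user's pending ask
 * by the headroom left under the computed user limit, i.e.
 * computed-user-limit - user-usage(partition), rather than by
 * LeafQueue#getHeadroom(), which can report zero in the case described above.
 */
public class UserLimitBoundedPending {

  static long boundPendingByUserLimit(long pendingMB, long userLimitMB,
      long userUsedMB) {
    // Headroom under the computed user limit for this partition; never negative.
    long userHeadroomMB = Math.max(0, userLimitMB - userUsedMB);
    // Only the portion of the ask that fits under the user limit is reported
    // as pending to the preemption monitor. The queue's max capacity is
    // deliberately ignored here; it is applied later during the offer stage.
    return Math.min(pendingMB, userHeadroomMB);
  }

  public static void main(String[] args) {
    // User limit 20 GB, user already uses 18 GB, app asks for 8 GB more:
    // only 2 GB should drive preemption on this user's behalf.
    System.out.println(boundPendingByUserLimit(8192, 20480, 18432)); // 2048
  }
}
{code}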

> Preemption occurring unnecessarily because preemption doesn't consider user 
> limit
> ---------------------------------------------------------------------------------
>
>                 Key: YARN-3769
>                 URL: https://issues.apache.org/jira/browse/YARN-3769
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>         Attachments: YARN-3769-branch-2.002.patch, 
> YARN-3769-branch-2.7.002.patch, YARN-3769-branch-2.7.003.patch, 
> YARN-3769.001.branch-2.7.patch, YARN-3769.001.branch-2.8.patch, 
> YARN-3769.003.patch, YARN-3769.004.patch
>
>
> We are seeing the preemption monitor preempting containers from queue A and 
> then seeing the capacity scheduler giving them immediately back to queue A. 
> This happens quite often and causes a lot of churn.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
