Wangda Tan commented on YARN-3769:

[~eepayne], thanks for the update.

bq. If you want, we can pull this out and put it as part of a different JIRA so 
we can document and discuss that particular flapping situation separately.
I would prefer to make it a separate JIRA, since it is not a directly related 
fix. I will review PCPP after you separate those changes (since you're OK 
with separating them :))

bq. Yes, you are correct. getHeadroom could be calculating zero headroom when 
we don't want it to. And, I agree that we don't need to limit pending resources 
to max queue capacity when calculating pending resources. The concern for this 
fix is that user limit factor should be considered and limit the pending value. 
The max queue capacity will be considered during the offer stage of the 
preemption calculations.

I agree with your existing approach; user-limit should be capped by max queue 
capacity as well.
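To make the capping concrete, here is a minimal arithmetic sketch of that idea. The numbers and variable names are hypothetical, not taken from the patch; it only illustrates "user-limit capped by max queue capacity" as min(userLimit, queueMaxCapacity) before computing headroom:

```java
public class UserLimitCapDemo {
    public static void main(String[] args) {
        // Hypothetical values, memory in MB.
        long userLimit = 12288;        // computed user limit for this user
        long queueMaxCapacity = 10240; // queue's configured maximum capacity
        long userUsed = 6144;          // resources the user is already using

        // Cap the user limit by the queue's max capacity, per the discussion above.
        long effectiveLimit = Math.min(userLimit, queueMaxCapacity);

        // Headroom that can count toward preemptable pending, floored at zero.
        long headroom = Math.max(0, effectiveLimit - userUsed);

        System.out.println(headroom); // 10240 - 6144 = 4096
    }
}
```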

One nit for LeafQueue changes:
1534        minPendingAndPreemptable =
1535            Resources.componentwiseMax(Resources.none(),
1536                Resources.subtract(
1537                    userNameToHeadroom.get(userName), 

You don't need componentwiseMax here, since minPendingAndPreemptable <= 
headroom, and you can use subtractFrom to make the code simpler.
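For illustration, a self-contained sketch of that nit. The Resource/Resources classes below are simplified stand-ins for Hadoop's real ones, and the second operand of the subtraction is a hypothetical pending value (the quoted snippet is truncated); the point is only that when the subtrahend never exceeds the headroom, clamping with componentwiseMax(none, ...) is redundant and an in-place subtractFrom yields the same result:

```java
// Simplified stand-ins for Hadoop's Resource/Resources (illustration only).
class Resource {
    long memory;
    int vcores;
    Resource(long memory, int vcores) { this.memory = memory; this.vcores = vcores; }
}

class Resources {
    static Resource none() { return new Resource(0, 0); }

    static Resource subtract(Resource a, Resource b) {
        return new Resource(a.memory - b.memory, a.vcores - b.vcores);
    }

    // In-place subtraction: mutates and returns lhs.
    static Resource subtractFrom(Resource lhs, Resource rhs) {
        lhs.memory -= rhs.memory;
        lhs.vcores -= rhs.vcores;
        return lhs;
    }

    static Resource componentwiseMax(Resource a, Resource b) {
        return new Resource(Math.max(a.memory, b.memory),
                            Math.max(a.vcores, b.vcores));
    }
}

public class SubtractFromDemo {
    public static void main(String[] args) {
        Resource headroom = new Resource(8192, 8); // hypothetical user headroom
        Resource pending  = new Resource(2048, 2); // hypothetical value subtracted

        // Original pattern: clamp at zero after subtracting.
        Resource clamped = Resources.componentwiseMax(Resources.none(),
            Resources.subtract(headroom, pending));

        // Suggested pattern: since the result cannot go negative here
        // (the subtrahend is bounded by headroom), subtract in place.
        Resource copy = new Resource(headroom.memory, headroom.vcores);
        Resource inPlace = Resources.subtractFrom(copy, pending);

        System.out.println(clamped.memory == inPlace.memory
            && clamped.vcores == inPlace.vcores);
    }
}
```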

> Preemption occurring unnecessarily because preemption doesn't consider user 
> limit
> ---------------------------------------------------------------------------------
>                 Key: YARN-3769
>                 URL: https://issues.apache.org/jira/browse/YARN-3769
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: capacityscheduler
>    Affects Versions: 2.6.0, 2.7.0, 2.8.0
>            Reporter: Eric Payne
>            Assignee: Eric Payne
>         Attachments: YARN-3769-branch-2.002.patch, 
> YARN-3769-branch-2.7.002.patch, YARN-3769-branch-2.7.003.patch, 
> YARN-3769.001.branch-2.7.patch, YARN-3769.001.branch-2.8.patch, 
> YARN-3769.003.patch, YARN-3769.004.patch
> We are seeing the preemption monitor preempting containers from queue A and 
> then seeing the capacity scheduler giving them immediately back to queue A. 
> This happens quite often and causes a lot of churn.

This message was sent by Atlassian JIRA