Eric Payne commented on YARN-8020:

[~kyungwan nam], on what version of YARN are you seeing this problem? My 
experience with DRF differs from what is described above. I have investigated 
this on both 2.8 and 3.2 snapshot builds.

We are using the DRF calculator in large preemptable queues with containers of 
various sizes, some using large memory, some large vcores, and some both. 
Cross-queue preemption seems to be working well in general. I do see a corner 
case, but first I want to address your comments above.

bq. as a result, idealAssigned will be <Memory:-81GB, VCores:19>, which does 
not trigger preemption.
If any element of the idealAssigned Resource is 0 or less, preemption will not 
occur. This keeps preemption from bringing the queue too far below its 
guarantee on any one element. That said, preemption will still proceed even 
when it brings one of the elements below its guarantee, but once any element 
of the idealAssigned Resource reaches 0 or below, it stops preempting.
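To illustrate the check described above, here is a minimal sketch of the guard; the class and method names are illustrative, not YARN's actual API:

```java
// Hypothetical simplification: preemption stops for a queue once any
// element of its computed idealAssigned is 0 or below.
public class IdealAssignedGuard {
    static boolean shouldContinuePreemption(long idealMemoryGB, long idealVcores) {
        // Stop preempting once either element reaches 0 or below, so
        // preemption cannot drag the queue too far under its guarantee.
        return idealMemoryGB > 0 && idealVcores > 0;
    }

    public static void main(String[] args) {
        // idealAssigned = <Memory:-81GB, VCores:19> from the report below
        System.out.println(shouldContinuePreemption(-81, 19)); // prints "false"
        System.out.println(shouldContinuePreemption(19, 19));  // prints "true"
    }
}
```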

bq. avail: <Memory:181GB, Vcores:1>
Cross-queue preemption will not preempt if there are available resources in the 
cluster or queue. It depends on how many resources are being requested by the 
other queue, but even with 1 available vcore, preemption may choose not to 
preempt in this case as well.
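For context, DRF evaluates a Resource by its dominant share, the larger of its per-element shares of the cluster. A rough illustration (a simplified stand-in, not YARN's actual DominantResourceCalculator code): even with only 1 vcore free, the avail Resource above still looks large under DRF because its memory share dominates.

```java
// Illustrative dominant-share computation under DRF (simplified stand-in,
// not YARN's DominantResourceCalculator).
public class DominantShare {
    static double dominantShare(double memGB, double vcores,
                                double clusterMemGB, double clusterVcores) {
        // The dominant share is the larger of the per-element shares.
        return Math.max(memGB / clusterMemGB, vcores / clusterVcores);
    }

    public static void main(String[] args) {
        // avail = <Memory:181GB, VCores:1> in a <Memory:200GB, VCores:20> cluster
        System.out.println(dominantShare(181, 1, 200, 20)); // prints "0.905"
    }
}
```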

Now on to my corner case.

I do not see a problem using DRF when the containers in the preemptable queue 
have a larger Resource element and the containers in the asking queue have 
smaller Resource elements. For example, it seems to work fine when the 
preemptable queue is using <Memory:1GB, VCores:10> containers and the asking 
queue is using smaller containers, for example <Memory:1GB, VCores:3>.

The place where it seems to get stuck is when the containers in the preemptable 
queue are smaller in one or more Resource elements than the containers in the 
asking queue. For example, it will sometimes not preempt if the preemptable 
queue has containers using <Memory:0.5GB, VCores:1> and the asking queue has 
containers using <Memory:0.5GB, VCores:2>.

Even in the latter case, preemption will sometimes still occur, depending on 
the ratio of the sizes of each element to the ones in the other queue.
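One way the element ratios can interact: Resources.min with the dominant resource calculator compares the two whole Resources by dominant share and returns one of them; it is not an element-wise minimum. A simplified stand-in (not YARN's actual classes), using the numbers from the report below:

```java
// Sketch: DRF min returns whichever whole Resource has the smaller
// dominant share, so a negative element can survive into "accepted".
public class DrfMin {
    static double dominant(long memGB, long vcores, long cMemGB, long cVcores) {
        return Math.max((double) memGB / cMemGB, (double) vcores / cVcores);
    }

    // Returns the Resource (as {memGB, vcores}) with the smaller dominant share.
    static long[] drfMin(long[] a, long[] b, long cMemGB, long cVcores) {
        return dominant(a[0], a[1], cMemGB, cVcores)
                <= dominant(b[0], b[1], cMemGB, cVcores) ? a : b;
    }

    public static void main(String[] args) {
        long cMemGB = 200, cVcores = 20;
        long[] avail = {181, 1};            // <Memory:181GB, VCores:1>
        long[] unsatisfied = {-181, 9};     // current + pending - assigned
        long[] picked = drfMin(avail, unsatisfied, cMemGB, cVcores);
        // The whole <-181, 9> Resource wins (dominant share 0.45 vs 0.905),
        // so the negative memory element flows into accepted/idealAssigned.
        System.out.println(picked[0] + " " + picked[1]); // prints "-181 9"
    }
}
```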

It would be helpful if you could provide a more detailed use case describing 
exactly what you are seeing, so I can try to reproduce it.

> when DRF is used, preemption does not trigger due to incorrect idealAssigned
> ----------------------------------------------------------------------------
>                 Key: YARN-8020
>                 URL: https://issues.apache.org/jira/browse/YARN-8020
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: kyungwan nam
>            Priority: Major
> I’ve run into a case where Inter-Queue Preemption does not work.
> It happens when DRF is used and an application is submitted with a large 
> number of vcores.
> IMHO, idealAssigned can be set incorrectly by the following code.
> {code}
> // This function "accepts" all the resources it can (pending) and return
> // the unused ones
> Resource offer(Resource avail, ResourceCalculator rc,
>     Resource clusterResource, boolean considersReservedResource) {
>   Resource absMaxCapIdealAssignedDelta = Resources.componentwiseMax(
>       Resources.subtract(getMax(), idealAssigned),
>       Resource.newInstance(0, 0));
>   // accepted = min{avail,
>   //               max - assigned,
>   //               current + pending - assigned,
>   //               # Make sure a queue will not get more than max of its
>   //               # used/guaranteed, this is to make sure preemption won't
>   //               # happen if all active queues are beyond their guaranteed
>   //               # This is for leaf queue only.
>   //               max(guaranteed, used) - assigned}
>   // remain = avail - accepted
>   Resource accepted = Resources.min(rc, clusterResource,
>       absMaxCapIdealAssignedDelta,
>       Resources.min(rc, clusterResource, avail, Resources
>           /*
>            * When we're using FifoPreemptionSelector (considerReservedResource
>            * = false).
>            *
>            * We should deduct reserved resource from pending to avoid excessive
>            * preemption:
>            *
>            * For example, if an under-utilized queue has used = reserved = 20.
>            * Preemption policy will try to preempt 20 containers (which is not
>            * satisfied) from different hosts.
>            *
>            * In FifoPreemptionSelector, there's no guarantee that preempted
>            * resource can be used by pending request, so policy will preempt
>            * resources repeatedly.
>            */
>           .subtract(Resources.add(getUsed(),
>               (considersReservedResource ? pending : pendingDeductReserved)),
>               idealAssigned)));
> {code}
> let’s say,
> * cluster resource : <Memory:200GB, VCores:20>
> * idealAssigned(assigned): <Memory:100GB, VCores:10>
> * avail: <Memory:181GB, VCores:1>
> * current: <Memory:19GB, VCores:19>
> * pending: <Memory:0, VCores:0>
> current + pending - assigned: <Memory:-181GB, VCores:9>
> min ( avail, (current + pending - assigned) ): <Memory:-181GB, VCores:9>
> accepted: <Memory:-181GB, VCores:9>
> as a result, idealAssigned will be <Memory:-81GB, VCores:19>, which does not 
> trigger preemption.
