[
https://issues.apache.org/jira/browse/YARN-8020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16397262#comment-16397262
]
Eric Payne commented on YARN-8020:
----------------------------------
[~kyungwan nam], on what version of YARN are you seeing this problem? My
experience with DRF is different from what is described above. I have
investigated this on both 2.8 and 3.2 snapshot builds.
We are using the DRF calculator in large preemptable queues with containers of
various sizes, some large in memory, some in vcores, and some in both.
Cross-queue preemption seems to be working well in general. I do see a corner
case, but first I want to address your above comments.
bq. as a result, idealAssigned will be <Memory:-81GB, VCores:19>, which does
not trigger preemption.
If any element of the idealAssigned Resource is 0 or less, preemption will not
occur. This prevents preemption from bringing the queue too far below its
guarantee for one of the elements. Having said that, preemption will still go
quite far even when it takes one of the elements below its guarantee, but once
any element of the idealAssigned Resource reaches 0 or below, it stops
preempting.
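To make that guard concrete, here is a minimal sketch (illustrative only, not the actual YARN code; the {{Res}} type and method names are my own stand-ins for YARN's {{Resource}}):

```java
// Illustrative sketch of the rule described above: once any component of
// idealAssigned drops to 0 or below, the queue is treated as having
// nothing left to claim via preemption.
public class IdealAssignedGuard {

    // Hypothetical stand-in for YARN's Resource (memory in GB, vcores).
    static final class Res {
        final long memoryGB;
        final int vcores;
        Res(long memoryGB, int vcores) {
            this.memoryGB = memoryGB;
            this.vcores = vcores;
        }
    }

    // Preemption proceeds only while both components stay positive.
    static boolean shouldKeepPreempting(Res idealAssigned) {
        return idealAssigned.memoryGB > 0 && idealAssigned.vcores > 0;
    }

    public static void main(String[] args) {
        // The reporter's example: idealAssigned = <-81GB, 19 vcores>.
        // Memory is <= 0, so preemption will not trigger.
        System.out.println(shouldKeepPreempting(new Res(-81, 19)));
        // Both components positive: preemption can still proceed.
        System.out.println(shouldKeepPreempting(new Res(100, 10)));
    }
}
```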
bq. avail: <Memory:181GB, Vcores:1>
Cross-queue preemption will not preempt if there are available resources in the
cluster or queue. It depends on how many resources the other queue is
requesting, but even with only 1 available vcore, preemption may choose not to
preempt in this case as well.
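For reference, under DRF a "min" of two resource vectors is decided by their dominant share of the cluster, not component-wise. The following sketch (my own illustration, not the {{Resources}}/{{ResourceCalculator}} API) shows how, with the reporter's numbers, the vector with a negative memory component still "wins" the min:

```java
// Sketch of a DRF-style comparison: each vector's dominant share is the
// larger of its per-resource fractions of the cluster, and min picks the
// vector with the smaller dominant share.
public class DrfMin {

    static double dominantShare(long memGB, int vcores,
            long clusterMemGB, int clusterVcores) {
        return Math.max((double) memGB / clusterMemGB,
                (double) vcores / clusterVcores);
    }

    public static void main(String[] args) {
        long clusterMem = 200;   // <Memory:200GB, VCores:20> cluster
        int clusterVcores = 20;

        // avail = <181GB, 1 vcore>: dominated by memory, share 0.905.
        double availShare = dominantShare(181, 1, clusterMem, clusterVcores);

        // current + pending - assigned = <-181GB, 9 vcores>:
        // dominated by vcores, share 0.45.
        double deltaShare = dominantShare(-181, 9, clusterMem, clusterVcores);

        // The smaller dominant share wins the min, so the vector with a
        // negative memory component flows into accepted/idealAssigned.
        System.out.println(deltaShare < availShare);
    }
}
```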
Now on to my corner case.
I do not see a problem using DRF when the containers in the preemptable queue
have larger Resource elements and the containers in the asking queue have
smaller ones. For example, it seems to work fine if the preemptable queue is
using <Memory:1GB, VCores:10> containers and the asking queue is using smaller
containers, for example <Memory:1GB, VCores:3>.
Where it seems to get stuck is when the containers in the preemptable queue
have one or more Resource elements smaller than those in the asking queue's
containers. For example, it will sometimes not preempt if the preemptable queue
has containers using <Memory:0.5GB, VCores:1> and the asking queue has
containers using <Memory:0.5GB, VCores:2>.
Even in the latter case, preemption will sometimes still occur, depending on
the ratio of the sizes of each element to those in the other queue.
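One plausible way to quantify the asymmetry above (my own interpretation for illustration, not code from YARN): count how many preemptable containers must be freed before the reclaimed resources cover a single container of the asking queue.

```java
// Illustrative arithmetic only: how many containers of the preemptable
// queue must be freed so that every element of one asking-queue
// container is covered. Memory is in MB so the 0.5GB case stays integral.
public class PreemptionRatio {

    static int containersToFreeForOneAsk(int freedMemMB, int freedVcores,
            int askMemMB, int askVcores) {
        int byMem = (askMemMB + freedMemMB - 1) / freedMemMB;       // ceil
        int byVcores = (askVcores + freedVcores - 1) / freedVcores; // ceil
        return Math.max(byMem, byVcores);
    }

    public static void main(String[] args) {
        // Working case: preemptable <1GB,10>, asking <1GB,3>:
        // one freed container covers one ask.
        System.out.println(containersToFreeForOneAsk(1024, 10, 1024, 3));
        // Stuck case: preemptable <0.5GB,1>, asking <0.5GB,2>:
        // two freed containers are needed per ask.
        System.out.println(containersToFreeForOneAsk(512, 1, 512, 2));
    }
}
```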
It would be helpful if you could provide a more detailed use case describing
exactly what you are seeing so I can try to reproduce it.
> when DRF is used, preemption does not trigger due to incorrect idealAssigned
> ----------------------------------------------------------------------------
>
> Key: YARN-8020
> URL: https://issues.apache.org/jira/browse/YARN-8020
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: kyungwan nam
> Priority: Major
>
> I’ve encountered a case where Inter-Queue Preemption does not work.
> It happens when DRF is used and an application is submitted with a large
> number of vcores.
> IMHO, idealAssigned can be set incorrectly by the following code.
> {code}
>   // This function "accepts" all the resources it can (pending) and returns
>   // the unused ones
>   Resource offer(Resource avail, ResourceCalculator rc,
>       Resource clusterResource, boolean considersReservedResource) {
>     Resource absMaxCapIdealAssignedDelta = Resources.componentwiseMax(
>         Resources.subtract(getMax(), idealAssigned),
>         Resource.newInstance(0, 0));
>     // accepted = min{avail,
>     //               max - assigned,
>     //               current + pending - assigned,
>     //               # Make sure a queue will not get more than max of its
>     //               # used/guaranteed, this is to make sure preemption won't
>     //               # happen if all active queues are beyond their guaranteed
>     //               # This is for leaf queue only.
>     //               max(guaranteed, used) - assigned}
>     // remain = avail - accepted
>     Resource accepted = Resources.min(rc, clusterResource,
>         absMaxCapIdealAssignedDelta,
>         Resources.min(rc, clusterResource, avail, Resources
>             /*
>              * When we're using FifoPreemptionSelector
>              * (considerReservedResource = false).
>              *
>              * We should deduct reserved resource from pending to avoid
>              * excessive preemption:
>              *
>              * For example, if an under-utilized queue has
>              * used = reserved = 20. Preemption policy will try to preempt
>              * 20 containers (which is not satisfied) from different hosts.
>              *
>              * In FifoPreemptionSelector, there's no guarantee that preempted
>              * resource can be used by pending request, so policy will preempt
>              * resources repeatedly.
>              */
>             .subtract(Resources.add(getUsed(),
>                 (considersReservedResource ? pending : pendingDeductReserved)),
>                 idealAssigned)));
> {code}
> Let’s say:
> * cluster resource : <Memory:200GB, VCores:20>
> * idealAssigned(assigned): <Memory:100GB, VCores:10>
> * avail: <Memory:181GB, Vcores:1>
> * current: <Memory:19GB, Vcores:19>
> * pending: <Memory:0, Vcores:0>
> current + pending - assigned: <Memory:-181GB, Vcores:9>
> min ( avail, (current + pending - assigned) ) : <Memory:-181GB, Vcores:9>
> accepted: <Memory:-181GB, Vcores:9>
> as a result, idealAssigned will be <Memory:-81GB, VCores:19>, which does not
> trigger preemption.
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)