[
https://issues.apache.org/jira/browse/YARN-7469?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16247790#comment-16247790
]
Eric Payne commented on YARN-7469:
----------------------------------
When a queue is in the state as described above,
{{FifoIntraQueuePreemptionPlugin#calculateToBePreemptedResourcePerApp}} decides
(erroneously, I believe) that {{app2}} has preemptable resources. Since
{{app2}} is the youngest with apparent resources,
{{FifoIntraQueuePreemptionPlugin#preemptFromLeastStarvedApp}} selects a
container to preempt from {{app2}}. However, when it calls
{{FifoIntraQueuePreemptionPlugin#skipContainerBasedOnIntraQueuePolicy}}, that
method decides that preempting the selected container would bring the user's
usage down too far relative to the user limit, so it skips the container. It
does not then go on to the next youngest app with resources, so nothing is
preempted.
The logic breaks down to basically this:
{code}
calculateToBePreemptedResourcePerApp {
  // preemtableFromApp will be used to select containers to preempt.
  preemtableFromApp = used - (userlimit - AmSize)
}
skipContainerBasedOnIntraQueuePolicy {
  if (used - selectedContainerSize) <= (userlimit + AmSize) {
    Skip this container
  }
}
{code}
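We get into this starvation mode when {{selectedContainerSize}} ends up being
the same size as {{preemtableFromApp}}. In that case,
{{used - selectedContainerSize}} equals {{userlimit - AmSize}}, which can never
be greater than {{userlimit + AmSize}}, so the container is always skipped.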
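To make the mismatch between the two formulas above concrete, here is a minimal Java sketch. It is not the actual Hadoop code: the class name, method names, and the numbers (in MB) are hypothetical and are not taken from the queue configuration in this issue. It only reproduces the simplified arithmetic from the pseudocode.

```java
// Hypothetical sketch of the simplified logic; not the real
// FifoIntraQueuePreemptionPlugin implementation.
class PreemptionMismatchSketch {

    // Simplified form of calculateToBePreemptedResourcePerApp:
    // how much the app appears to have available for preemption.
    static long preemptableFromApp(long used, long userLimit, long amSize) {
        return used - (userLimit - amSize);
    }

    // Simplified form of skipContainerBasedOnIntraQueuePolicy:
    // true means the selected container is skipped (not preempted).
    static boolean skipContainer(long used, long containerSize,
                                 long userLimit, long amSize) {
        return (used - containerSize) <= (userLimit + amSize);
    }

    public static void main(String[] args) {
        // Assumed, illustrative values in MB.
        long used = 10240;
        long userLimit = 6144;
        long amSize = 512;

        long preemptable = preemptableFromApp(used, userLimit, amSize);
        System.out.println("preemptableFromApp = " + preemptable);

        // A container exactly as large as preemptableFromApp.
        System.out.println("skip when container == preemptableFromApp: "
            + skipContainer(used, preemptable, userLimit, amSize));

        // A much smaller container, for comparison.
        System.out.println("skip when container == amSize: "
            + skipContainer(used, amSize, userLimit, amSize));
    }
}
```

With these assumed values, the first check reports that the container is skipped and the second reports that it is not. The first formula subtracts {{AmSize}} from the user limit while the second adds it, so a container whose size equals {{preemtableFromApp}} can never pass the skip check.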
> Capacity Scheduler Intra-queue preemption: User can starve if newest app is
> exactly at user limit
> -------------------------------------------------------------------------------------------------
>
> Key: YARN-7469
> URL: https://issues.apache.org/jira/browse/YARN-7469
> Project: Hadoop YARN
> Issue Type: Bug
> Components: capacity scheduler, yarn
> Affects Versions: 2.9.0, 3.0.0-beta1, 2.8.2
> Reporter: Eric Payne
> Assignee: Eric Payne
> Attachments: UnitTestToShowStarvedUser.patch
>
>
> Queue Configuration:
> - Total Memory: 20GB
> - 2 Queues
> -- Queue1
> --- Memory: 10GB
> --- MULP: 10%
> --- ULF: 2.0
> - Minimum Container Size: 0.5GB
> Use Case:
> - User1 submits app1 to Queue1 and consumes 20GB
> - User2 submits app2 to Queue1 and requests 7.5GB
> - Preemption monitor preempts 7.5GB from app1. Capacity Scheduler gives those
> resources to User2
> - User 3 submits app3 to Queue1. To begin with, app3 is requesting 1
> container for the AM.
> - Preemption monitor never preempts a container.