Wangda Tan created YARN-3243:
--------------------------------
Summary: CapacityScheduler should pass headroom from parent to
children to make sure ParentQueue obey its capacity limits.
Key: YARN-3243
URL: https://issues.apache.org/jira/browse/YARN-3243
Project: Hadoop YARN
Issue Type: Bug
Components: capacityscheduler, resourcemanager
Reporter: Wangda Tan
Assignee: Wangda Tan
Currently, CapacityScheduler cannot guarantee that a ParentQueue always obeys its
capacity limits. For example:
1) When allocating a container under a parent queue, the scheduler only checks
parentQueue.usage < parentQueue.max. If a leaf queue allocates a container with
size > (parentQueue.max - parentQueue.usage), the parent queue can exceed its max
resource limit, as in the following example:
{code}
                A (usage=54, max=55)
               /                    \
A1 (usage=53, max=53)        A2 (usage=1, max=55)
{code}
Queue A2 is able to allocate a container since its usage < max, but A has only
55 - 54 = 1 left, so any container larger than that makes A's usage exceed A.max.
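The gap in case 1 can be illustrated with a minimal sketch. It assumes resources are plain scalars (real YARN resources are multi-dimensional), and the class and method names are hypothetical, not CapacityScheduler code:

```java
// Hypothetical illustration only: scalar resources, not the real scheduler classes.
final class ParentLimitCheck {

    // The check described in this issue: only compares current usage against max.
    static boolean usageOnlyCheck(long parentUsed, long parentMax) {
        return parentUsed < parentMax;
    }

    // A stricter check: the container must fit in what the parent has left.
    static boolean fitsInParent(long parentUsed, long parentMax, long containerSize) {
        return containerSize <= parentMax - parentUsed;
    }

    public static void main(String[] args) {
        long aUsed = 54, aMax = 55;   // queue A from the example above
        long containerSize = 2;       // assumed request size, for illustration

        System.out.println(usageOnlyCheck(aUsed, aMax));              // true
        System.out.println(fitsInParent(aUsed, aMax, containerSize)); // false
    }
}
```

With the usage-only check the allocation is accepted and A would end at 56 > 55; the size-aware check rejects it.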
2) When doing the continuous reservation check, the parent queue only tells its
children "you need to unreserve *some* resource so that I stay below my maximum
resource", but it does not tell them how much resource needs to be unreserved.
This may also lead to the parent queue exceeding its configured maximum capacity.
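One way to picture the missing quantity in case 2 is the sketch below. It assumes scalar resources and that the parent's usage already counts reserved resources; the helper name is hypothetical and this is not the scheduler's actual logic:

```java
// Hypothetical illustration only: how much must be unreserved so that a new
// request fits under the parent's maximum.
final class UnreserveAmount {

    // Returns 0 when the request already fits; otherwise the overshoot.
    static long amountToUnreserve(long parentUsed, long parentMax, long requested) {
        return Math.max(0L, parentUsed + requested - parentMax);
    }

    public static void main(String[] args) {
        // Parent at 54 of 55, request of 3: 2 units must be released first.
        System.out.println(amountToUnreserve(54, 55, 3)); // 2
        // Parent at 10 of 55, request of 3: nothing to release.
        System.out.println(amountToUnreserve(10, 55, 3)); // 0
    }
}
```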
With YARN-3099/YARN-3124, we now have a {{ResourceUsage}} object in each queue.
*Here is my proposal*:
- ParentQueue will set its children's ResourceUsage.headroom, which means the
*maximum resource its children can allocate*.
- ParentQueue will set its children's headroom to (say the parent's name is
"qA"): min(qA.headroom, qA.max - qA.used). This makes sure the capacity of qA's
ancestors is enforced as well (qA.headroom is set by qA's parent).
- {{needToUnReserve}} is no longer necessary; instead, children can compute how
much resource needs to be unreserved to stay within the parent's resource limit.
- Moreover, with this, YARN-3026 will draw a clear boundary between LeafQueue
and FiCaSchedulerApp; headroom will take user-limit, etc. into account.
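The headroom rule proposed above can be sketched as follows. This is a rough illustration under stated assumptions, not the eventual patch: resources are scalars, {{QueueNode}} and its methods are hypothetical names, the root is given an unbounded headroom, and the clamp to zero is an added safety measure that the proposal does not mention.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical queue model; not the real ParentQueue/LeafQueue classes.
final class QueueNode {
    final String name;
    final long max;
    long used;
    long headroom; // set by this queue's parent
    final List<QueueNode> children = new ArrayList<>();

    QueueNode(String name, long used, long max) {
        this.name = name;
        this.used = used;
        this.max = max;
    }

    QueueNode addChild(QueueNode child) {
        children.add(child);
        return this;
    }

    // Children receive min(qA.headroom, qA.max - qA.used), so every
    // ancestor's limit is carried down the tree.
    void propagateHeadroom(long ownHeadroom) {
        this.headroom = ownHeadroom;
        long childHeadroom = Math.max(0L, Math.min(ownHeadroom, max - used));
        for (QueueNode child : children) {
            child.propagateHeadroom(childHeadroom);
        }
    }

    // A queue may allocate only what fits both its own limit and its headroom.
    boolean canAllocate(long size) {
        return size <= Math.min(headroom, max - used);
    }

    public static void main(String[] args) {
        QueueNode a1 = new QueueNode("A1", 53, 53);
        QueueNode a2 = new QueueNode("A2", 1, 55);
        QueueNode a = new QueueNode("A", 54, 55).addChild(a1).addChild(a2);

        a.propagateHeadroom(Long.MAX_VALUE); // assumption: root is unbounded

        System.out.println(a2.headroom);       // 1
        System.out.println(a2.canAllocate(1)); // true
        System.out.println(a2.canAllocate(2)); // false
    }
}
```

Applied to the example tree, A2 is limited to the single unit A has left, even though A2's own max would allow far more.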