[
https://issues.apache.org/jira/browse/YARN-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15377092#comment-15377092
]
Sunil G commented on YARN-5342:
-------------------------------
I feel the current approach is a simple and transparent way to solve container
allocation for non-exclusive labels.
A few alternatives could have been:
1. We could still allocate no_label containers to a non-exclusive partition (OR
not reset the scheduling opportunity), provided that
- non-exclusive partition pending + no_label container resource demand (for 1
request) < total free (available) resources in the non-exclusive partition.
- It is possible that the complete *pending* resource for a non-exclusive
partition cannot be allocated due to user-limit/factor, am-resource-percentage,
etc. So if we can get the effective pending value, we could use it in the
equation and do more allocation in the non-exclusive partition.
2. Another idea is to overcommit when specific per-partition demand comes in
for the non-exclusive partition, and to preempt other containers if needed.
This is very aggressive, so I do not feel it will be
acceptable.
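The condition in alternative 1 can be sketched as below. This is only an illustration, not code from the patch or from the Capacity Scheduler: the class, the `Res` type and the method name are hypothetical, and it assumes resources are compared per dimension (memory and vcores).

```java
// Hypothetical sketch of the check in alternative 1.
// None of these names exist in the YARN code base.
final class NonExclusiveHeadroomCheck {

    /** Minimal two-dimensional resource: memory in MB and virtual cores. */
    static final class Res {
        final long memoryMb;
        final long vcores;

        Res(long memoryMb, long vcores) {
            this.memoryMb = memoryMb;
            this.vcores = vcores;
        }

        /** Component-wise sum of two resources. */
        Res plus(Res other) {
            return new Res(memoryMb + other.memoryMb, vcores + other.vcores);
        }

        /** True only if every dimension is strictly smaller than in other. */
        boolean strictlyLessThan(Res other) {
            return memoryMb < other.memoryMb && vcores < other.vcores;
        }
    }

    /**
     * partition pending + demand of one no_label request < free resources
     * in the non-exclusive partition.
     */
    static boolean canPlaceNoLabelContainer(Res partitionPending,
                                            Res noLabelRequest,
                                            Res partitionFree) {
        return partitionPending.plus(noLabelRequest).strictlyLessThan(partitionFree);
    }
}
```

If an effective pending value were available (pending after user-limit, am-resource-percentage, etc.), it could be passed as `partitionPending` in place of the raw pending value, which would let the check succeed more often.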
But these are not very transparent or easy to explain to users as a whitebox
operation. So we could discuss and continue this in a new ticket, provided the
current patch goes in. I could raise another ticket as an improvement task.
Thoughts, [~leftnoteasy] / [~Naganarasimha Garla]?
> Improve non-exclusive node partition resource allocation in Capacity Scheduler
> ------------------------------------------------------------------------------
>
> Key: YARN-5342
> URL: https://issues.apache.org/jira/browse/YARN-5342
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Assignee: Wangda Tan
> Attachments: YARN-5342.1.patch
>
>
> In the previous implementation, one non-exclusive container allocation is
> possible when the missed-opportunity >= #cluster-nodes. And
> missed-opportunity will be reset when a container is allocated to any node.
> This will slow down the frequency of container allocation on a non-exclusive
> node partition: *When a non-exclusive partition=x has idle resource, we can
> only allocate one container for this app in every
> X=nodemanagers.heartbeat-interval secs for the whole cluster.*
> In this JIRA, I propose a fix to reset missed-opportunity only if we have >0
> pending resource for the non-exclusive partition OR we get allocation from
> the default partition.
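
The difference between the previous and the proposed reset rule in the description above can be modelled roughly as follows. This is a simplified sketch, not the actual scheduler code: the class and method names are hypothetical, and it assumes the counter grows by one for each missed scheduling opportunity.

```java
// Hypothetical model of the missed-opportunity counter from the description.
// The names are illustrative only and are not taken from the patch.
final class MissedOpportunityModel {
    private final int clusterNodes;
    private long missedOpportunities;

    MissedOpportunityModel(int clusterNodes) {
        this.clusterNodes = clusterNodes;
    }

    /** Assumption: called once for each scheduling opportunity the app misses. */
    void recordMiss() {
        missedOpportunities++;
    }

    /** A non-exclusive allocation is possible once missed-opportunity >= #cluster-nodes. */
    boolean mayUseNonExclusivePartition() {
        return missedOpportunities >= clusterNodes;
    }

    /** Previous behaviour: a container allocated to any node resets the counter. */
    void onAllocationPrevious() {
        missedOpportunities = 0;
    }

    /**
     * Proposed behaviour: reset only if the non-exclusive partition has >0
     * pending resource OR the allocation came from the default partition.
     */
    void onAllocationProposed(long partitionPending, boolean allocatedFromDefaultPartition) {
        if (partitionPending > 0 || allocatedFromDefaultPartition) {
            missedOpportunities = 0;
        }
    }
}
```

Under the previous rule, each allocation forces the app to accumulate #cluster-nodes misses again before the next non-exclusive allocation. Under the proposed rule, an allocation on an idle non-exclusive partition (no pending resource for that partition, not from the default partition) leaves the counter untouched, so further allocations can follow without waiting.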