[
https://issues.apache.org/jira/browse/YARN-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382251#comment-15382251
]
Naganarasimha G R commented on YARN-5342:
-----------------------------------------
Hi [~sunilg],
As discussed offline, I still feel we should not have logic that resets the
counter in an app based on the pending resources of the non-exclusive partition
node. Assume there are 3 partitions (P1 (2 nodes), P2 (20), default (30)) and an
allocation happens on P1. In that case we reset the counter if P1 has more
pending resource, even though P2 could have served a request for the default
partition. Hence it is not correct to base the logic on P1's pending resource.
The options I would suggest are:
# Get the used resources of the default partition. If they are greater than a
configured percentage, or usage is at 100%, do not reset; otherwise reset.
# Just subtract 1 from the counter for each allocation on a non-exclusive
partition node. If we think it is not efficient that only 1 gets subtracted,
i.e. only 1 per heartbeat, then we can subtract the number of allocations done
in a given time period, say 5 seconds.
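To make the two options concrete, here is a minimal Java sketch. It is not taken from any YARN-5342 patch: the class `MissedOpportunityCounter`, its methods, the usage ratio argument and the threshold and window parameters are all hypothetical names chosen for illustration, and the sliding-window handling is only one possible reading of option 2.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch; none of these names exist in the Capacity Scheduler. */
class MissedOpportunityCounter {
    // Assumed config: fraction of the default partition in use (0.0 to 1.0)
    // above which the counter is NOT reset.
    private final double defaultUsedThreshold;
    // Assumed config: length of the look-back window for option 2.
    private final long windowMillis;
    // Timestamps of recent allocations on non-exclusive partition nodes.
    private final Deque<Long> recentAllocations = new ArrayDeque<>();
    private long missedOpportunities;

    MissedOpportunityCounter(double defaultUsedThreshold, long windowMillis) {
        this.defaultUsedThreshold = defaultUsedThreshold;
        this.windowMillis = windowMillis;
    }

    void recordMissedOpportunity() {
        missedOpportunities++;
    }

    long get() {
        return missedOpportunities;
    }

    /** Option 1: reset only while the default partition still has headroom. */
    void onAllocationOption1(double defaultPartitionUsedRatio) {
        boolean defaultIsBusy = defaultPartitionUsedRatio > defaultUsedThreshold
                || defaultPartitionUsedRatio >= 1.0;
        if (!defaultIsBusy) {
            missedOpportunities = 0;
        }
    }

    /**
     * Option 2: instead of resetting, subtract the number of non-exclusive
     * allocations seen within the window (at least 1, for the current one).
     */
    void onAllocationOption2(long nowMillis) {
        recentAllocations.addLast(nowMillis);
        // Drop allocations that are older than the window.
        while (nowMillis - recentAllocations.peekFirst() > windowMillis) {
            recentAllocations.removeFirst();
        }
        missedOpportunities =
                Math.max(0L, missedOpportunities - recentAllocations.size());
    }
}
```

With a 5 second window, successive allocations subtract 1, then 2, and so on while they stay inside the window, so the counter decays faster when the non-exclusive partition is actively allocating, but it is never wiped out by a single allocation.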
Thoughts?
> Improve non-exclusive node partition resource allocation in Capacity Scheduler
> ------------------------------------------------------------------------------
>
> Key: YARN-5342
> URL: https://issues.apache.org/jira/browse/YARN-5342
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Assignee: Sunil G
> Attachments: YARN-5342.1.patch, YARN-5342.2.patch
>
>
> In the previous implementation, one non-exclusive container allocation is
> possible when the missed-opportunity >= #cluster-nodes. And
> missed-opportunity will be reset when a container is allocated to any node.
> This will slow down the frequency of container allocation on non-exclusive
> node partition: *When a non-exclusive partition=x has idle resource, we can
> only allocate one container for this app in every
> X=nodemanagers.heartbeat-interval secs for the whole cluster.*
> In this JIRA, I propose a fix to reset missed-opportunity only if we have >0
> pending resource for the non-exclusive partition OR we get allocation from
> the default partition.