[
https://issues.apache.org/jira/browse/YARN-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382702#comment-15382702
]
Naganarasimha G R commented on YARN-5342:
-----------------------------------------
[~wangda] & [~sunilg],
Well, I agree that YARN-4425 is too big a change, and I don't foresee it as
an immediate solution for 2.8, but I am not completely convinced about
checking the pending resource of the allocated node's partition.
bq. but could also lead to shareable resource allocation cut in line while
more specific resource request are pending.
IMHO it would be ideal to check whether pending resources exist in a given
partition before allocating a container in NonExclusive mode, rather than
after the allocation, when the counter for an app is reset. Otherwise, the
next NonExclusive mode allocation on a node of this partition might skip the
application whose counter was reset but still allocate to another
application, even though that partition still has pending resource requests.
So if you want to achieve the same, I would suggest adding a check for
pending resources before allocation in NonExclusive mode, and not resetting
this counter (in the app) for containers allocated in NonExclusive mode.
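To make the suggestion concrete, here is a rough sketch of the ordering I have in mind. It is not taken from the CapacityScheduler code: the class, the method names and the container-count model of "pending resource" are all made up for illustration, and the missed-opportunity >= #cluster-nodes threshold is simply carried over from the issue description.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch only; none of these names exist in the YARN code base.
final class NonExclusiveAllocationSketch {

    // Pending resource per partition, simplified to a container count (assumption).
    private final Map<String, Integer> pendingByPartition = new HashMap<>();

    // Missed-opportunity counter per application (assumption: keyed by app id).
    private final Map<String, Integer> missedOpportunities = new HashMap<>();

    void setPending(String partition, int containers) {
        pendingByPartition.put(partition, containers);
    }

    void recordMissedOpportunity(String appId) {
        missedOpportunities.merge(appId, 1, Integer::sum);
    }

    int getMissedOpportunities(String appId) {
        return missedOpportunities.getOrDefault(appId, 0);
    }

    // Check made BEFORE a NonExclusive allocation: refuse while the partition
    // itself still has pending requests, whichever app is asking.
    boolean canAllocateNonExclusive(String appId, String partition, int clusterNodes) {
        if (pendingByPartition.getOrDefault(partition, 0) > 0) {
            return false;
        }
        // Threshold carried over from the issue description.
        return getMissedOpportunities(appId) >= clusterNodes;
    }

    // After an allocation: reset the counter only for allocations that were
    // not made in NonExclusive mode.
    void onContainerAllocated(String appId, boolean nonExclusive) {
        if (!nonExclusive) {
            missedOpportunities.put(appId, 0);
        }
    }
}
```

With this ordering the pending-resource check applies to every application that tries a NonExclusive allocation on the partition, so no app depends on its own counter having been reset to stay out of the way.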
Thoughts?
> Improve non-exclusive node partition resource allocation in Capacity Scheduler
> ------------------------------------------------------------------------------
>
> Key: YARN-5342
> URL: https://issues.apache.org/jira/browse/YARN-5342
> Project: Hadoop YARN
> Issue Type: Sub-task
> Reporter: Wangda Tan
> Assignee: Sunil G
> Attachments: YARN-5342.1.patch, YARN-5342.2.patch
>
>
> In the previous implementation, one non-exclusive container allocation is
> possible when the missed-opportunity >= #cluster-nodes. And
> missed-opportunity will be reset when a container is allocated to any node.
> This will slow down the frequency of container allocation on non-exclusive
> node partition: *When a non-exclusive partition=x has idle resource, we can
> only allocate one container for this app in every
> X=nodemanagers.heartbeat-interval secs for the whole cluster.*
> In this JIRA, I propose a fix to reset missed-opportunity only if we have >0
> pending resource for the non-exclusive partition OR we get allocation from
> the default partition.