[ 
https://issues.apache.org/jira/browse/YARN-5342?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15382622#comment-15382622
 ] 

Wangda Tan commented on YARN-5342:
----------------------------------

[~Naganarasimha], I understand your concern that our existing proposal still 
resets the missed-opportunity counter aggressively, which could lead to slow 
allocation or resource under-utilization.

I also agree that solving this issue needs some global information and 
abstractions such as YARN-4425. I would like to be more conservative and keep 
this change as simple as possible, since I don't want it to introduce other 
issues, such as an app asking for a specific partition but not getting it 
because of shareable resource allocation.

I think your proposed approach could alleviate the existing slowness of 
resource allocation, but it could also let shareable resource allocation cut 
in line while more specific resource requests are pending. Unless we can do 
more comprehensive tests, I am not very confident in a more complicated 
approach (such as decreasing the missed-opportunity counter at some rate 
instead of resetting it).

Are you OK with moving comprehensive fixes to a separate JIRA and keeping the 
simple and straightforward changes in this one?

> Improve non-exclusive node partition resource allocation in Capacity Scheduler
> ------------------------------------------------------------------------------
>
>                 Key: YARN-5342
>                 URL: https://issues.apache.org/jira/browse/YARN-5342
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>            Reporter: Wangda Tan
>            Assignee: Sunil G
>         Attachments: YARN-5342.1.patch, YARN-5342.2.patch
>
>
> In the previous implementation, a non-exclusive container allocation is 
> possible only when missed-opportunity >= #cluster-nodes, and 
> missed-opportunity is reset when a container is allocated to any node.
> This reduces the frequency of container allocation on a non-exclusive 
> node partition: *When a non-exclusive partition=x has idle resources, we can 
> only allocate one container for this app every 
> X=nodemanagers.heartbeat-interval secs for the whole cluster.*
> In this JIRA, I propose a fix to reset missed-opportunity only if we have >0 
> pending resource for the non-exclusive partition OR we get an allocation 
> from the default partition.
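
To make the difference between the two reset rules in the description concrete, 
here is a minimal sketch in Java. It is not the Capacity Scheduler code and not 
taken from the attached patches: the class name, method names, and parameters 
are all hypothetical, and the reset condition is a literal reading of the 
description above.

```java
/**
 * Illustrative sketch only. All names here are hypothetical and do not
 * correspond to classes or methods in the Hadoop YARN code base.
 */
final class MissedOpportunitySketch {
    // Number of scheduling opportunities this app has passed up.
    private int missedOpportunities = 0;

    /** Called when the app skips a scheduling opportunity. */
    void recordMissedOpportunity() {
        missedOpportunities++;
    }

    /**
     * Gate from the description: a non-exclusive allocation is allowed
     * once missed-opportunity >= #cluster-nodes.
     */
    boolean canAllocateNonExclusive(int clusterNodes) {
        return missedOpportunities >= clusterNodes;
    }

    /** Previous behaviour: any container allocation resets the counter. */
    void onAllocatedPrevious() {
        missedOpportunities = 0;
    }

    /**
     * Proposed behaviour, read literally: reset only if the allocation came
     * from the default partition OR there is still >0 pending resource for
     * the non-exclusive partition. Both parameters are assumptions.
     */
    void onAllocatedProposed(boolean allocatedFromDefaultPartition,
                             long pendingForNonExclusivePartition) {
        if (allocatedFromDefaultPartition || pendingForNonExclusivePartition > 0) {
            missedOpportunities = 0;
        }
    }

    int getMissedOpportunities() {
        return missedOpportunities;
    }
}
```

Under the previous rule the counter has to climb back to #cluster-nodes after 
every allocation. Under the proposed rule it stays at its current value when 
neither condition holds, so the next non-exclusive allocation does not have to 
wait for another full pass over the cluster.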



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
