[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13638783#comment-13638783
 ] 

Carlo Curino commented on YARN-45:
----------------------------------

Updated the protocol patch (and the implementation for the capacity scheduler) 
to reflect the discussion in the above comments (plus various offline 
conversations).  The current proposal consists of a minimal protocol change and 
compact policies, which capture the portion of our initial proposal on which we 
reached reasonable consensus. 

The key changes are the following:
# Simplified the protocol modification to include only Set<ContainerID> as 
the vehicle to express preemption requests.
# Modified ProportionalCapacityPreemptionPolicy to select containers by 
reversed priority, and within each priority by reversed container id (reflects 
order of allocation). 
# Simplified all the "pipes" in the RM that propagate preemption decisions, 
so that they no longer carry resource-based preemptions. 
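
As an illustration of the selection order in item 2, here is a minimal sketch. 
It is not the code in the patch: Candidate and selectForPreemption are 
hypothetical stand-ins, and it assumes that a larger priority number means a 
less important container and that a larger container id means a later 
allocation. 

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

public class PreemptionOrderSketch {

    /** Hypothetical stand-in for a running container; not a YARN class. */
    static final class Candidate {
        final int priority;     // assumption: larger number = less important
        final long containerId; // assumption: larger id = allocated later

        Candidate(int priority, long containerId) {
            this.priority = priority;
            this.containerId = containerId;
        }
    }

    /** Reversed priority first, then reversed container id within a priority. */
    static final Comparator<Candidate> PREEMPTION_ORDER =
        Comparator.comparingInt((Candidate c) -> c.priority).reversed()
            .thenComparing(
                Comparator.comparingLong((Candidate c) -> c.containerId).reversed());

    /** Returns the ids of the first 'count' containers in preemption order. */
    static List<Long> selectForPreemption(List<Candidate> running, int count) {
        List<Candidate> sorted = new ArrayList<>(running);
        sorted.sort(PREEMPTION_ORDER);
        int n = Math.max(0, Math.min(count, sorted.size()));
        List<Long> ids = new ArrayList<>();
        for (Candidate c : sorted.subList(0, n)) {
            ids.add(c.containerId);
        }
        return ids;
    }

    public static void main(String[] args) {
        List<Candidate> running = new ArrayList<>();
        running.add(new Candidate(1, 10L));
        running.add(new Candidate(5, 11L));
        running.add(new Candidate(5, 12L));
        running.add(new Candidate(1, 13L));
        // Priority-5 containers come first (12, then 11), then priority 1.
        System.out.println(selectForPreemption(running, 3));
    }
}
```

Under these assumptions, the most recently allocated containers of the least 
important priority are picked first, so the oldest work is disturbed last. 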

The decision is based on the following rationale: 
There seems to be agreement on the fact that ResourceRequest-based preemption 
is appealing due to its symmetry, its compactness, and the flexibility it 
provides to the AM. 
However, the declarative nature of the specification makes the "tracking" over 
time quite tricky. In particular, both the RM and the AMs must be capable of 
maintaining some form of history of the resources being requested: 
# for the RM, to consciously preempt containers only for the fraction of 
resources that has been consistently asked of the AM over time (a notion of 
ResourceRequest intersection should be defined),  
# for the AM, to track its own preemption actions, and know when they are 
received by the RM (this is needed to discount the RM requests while the tasks 
are being checkpointed).

With [~chris.douglas] we worked out a possible set of semantics for the above 
and started to work on a version of the ProportionalCapacityPreemptionPolicy 
that reflects those. While they seem reasonable, they are likely to generate 
longer (speculative) discussions. 

So, following the spirit of [~acmurthy]'s last comment, and after feedback from 
[~tucu00], [~bikassaha], [~vinodkv], [~sseth], and [~hitesh], we propose 
Set<ContainerID> as an initial strategy that will allow us to: 
# observe most of the benefits of preemption, 
# gain experience in running schedulers leveraging preemption. 
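
To show why explicit container ids are easier to track than a declarative 
resource specification, here is a rough sketch of how an AM might process a 
Set<ContainerID> request. It is not part of the patch: the class and method 
names are hypothetical, and plain strings stand in for ContainerID. 

```java
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.Set;

public class ContainerSetPreemptionSketch {

    /**
     * Returns the containers the AM should newly start checkpointing/releasing:
     * those named by the RM that the AM still runs and is not already handling.
     * All three sets use strings as a stand-in for ContainerID (assumption).
     */
    static Set<String> newlyRequested(Set<String> requestedByRm,
                                      Set<String> runningInAm,
                                      Set<String> alreadyInProgress) {
        Set<String> result = new LinkedHashSet<>(requestedByRm);
        result.retainAll(runningInAm);       // ignore containers already gone
        result.removeAll(alreadyInProgress); // repeated requests are harmless
        return result;
    }

    public static void main(String[] args) {
        Set<String> requested = new LinkedHashSet<>(Arrays.asList("c1", "c2", "c3"));
        Set<String> running = new LinkedHashSet<>(Arrays.asList("c1", "c2", "c4"));
        Set<String> inProgress = new LinkedHashSet<>(Arrays.asList("c2"));
        System.out.println(newlyRequested(requested, running, inProgress));
    }
}
```

Because each request names concrete containers, a repeated request can be 
handled with plain set operations, without either side keeping a history of 
earlier requests. 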

                
> Scheduler feedback to AM to release containers
> ----------------------------------------------
>
>                 Key: YARN-45
>                 URL: https://issues.apache.org/jira/browse/YARN-45
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: resourcemanager
>            Reporter: Chris Douglas
>            Assignee: Carlo Curino
>         Attachments: YARN-45.patch, YARN-45.patch
>
>
> The ResourceManager strikes a balance between cluster utilization and strict 
> enforcement of resource invariants in the cluster. Individual allocations of 
> containers must be reclaimed- or reserved- to restore the global invariants 
> when cluster load shifts. In some cases, the ApplicationMaster can respond to 
> fluctuations in resource availability without losing the work already 
> completed by that task (MAPREDUCE-4584). Supplying it with this information 
> would be helpful for overall cluster utilization [1]. To this end, we want to 
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf
