[ 
https://issues.apache.org/jira/browse/YARN-45?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13628889#comment-13628889
 ] 

Alejandro Abdelnur commented on YARN-45:
----------------------------------------

Carlo, what about a small twist?

A preempt message (instead of request, as there is no preempt response) would 
contain:

* Resources (# CPUs & # Memory) : total amount of resources that may be 
preempted if no action is taken by the AM.
* Set<ContainerID> : list of containers that would be killed by the RM to claim 
the resources if no action is taken by the AM.

Computing the resources is straight forward, just aggregating the resources of 
the Set<ContainerID>.

An AM can take action using either or information.

If an AM releases the requested amount of resources, even if they don't match 
the received container IDs, then the AM will not be over threshold anymore, 
thus getting rid of the preemption pressure fully or partially. If the AM 
fullfils the preemption only partially, then the RM will still kill some 
containers from the set.

As the set is not ordered, still it is not known to the AM what containers will 
exactly be killed. So the set is just the list of containers in danger of being 
preempted.

I may be backtracking a bit on my previous comments, 'trading these containers 
for equivalent ones' seems acceptable and gives the scheduler some freedom on 
how to best take care of things if an AM is over limit. If an AM releases the 
requested amount of resources, regardless of what containers releases, the AM 
won't be preempted for this preemption message. We just need to clearly spell 
out the behavior.

With this approach I think we don't need #1 and #2?

Thoughts?


                
> Scheduler feedback to AM to release containers
> ----------------------------------------------
>
>                 Key: YARN-45
>                 URL: https://issues.apache.org/jira/browse/YARN-45
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>            Reporter: Chris Douglas
>            Assignee: Carlo Curino
>         Attachments: YARN-45.patch
>
>
> The ResourceManager strikes a balance between cluster utilization and strict 
> enforcement of resource invariants in the cluster. Individual allocations of 
> containers must be reclaimed- or reserved- to restore the global invariants 
> when cluster load shifts. In some cases, the ApplicationMaster can respond to 
> fluctuations in resource availability without losing the work already 
> completed by that task (MAPREDUCE-4584). Supplying it with this information 
> would be helpful for overall cluster utilization [1]. To this end, we want to 
> establish a protocol for the RM to ask the AM to release containers.
> [1] http://research.yahoo.com/files/yl-2012-003.pdf

--
This message is automatically generated by JIRA.
If you think it was sent incorrectly, please contact your JIRA administrators
For more information on JIRA, see: http://www.atlassian.com/software/jira

Reply via email to