[ 
https://issues.apache.org/jira/browse/YARN-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13735967#comment-13735967
 ] 

Carlo Curino commented on YARN-624:
-----------------------------------

Related to this is the work we just proposed in YARN-1051, where we manage 
dynamically negotiated reservations of capacity at admission control. The idea 
is that if I want gang scheduling, I can declare this at submission time, and 
the system accepts me only if it can "fit" me. At that level we do constraint 
checking / knapsack (e.g., ensuring that we never promise more gang-style 
reservations than we can fit). 
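
To make the "fit" check concrete, here is a minimal sketch of the idea, not the 
YARN-1051 design itself. It assumes a single resource dimension (container 
count), discrete time steps, and a fixed cluster capacity; the names 
Reservation, AdmissionController and try_admit are hypothetical and exist only 
for this illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reservation:
    app_id: str
    start: int       # first time step of the reservation (inclusive)
    end: int         # end of the reservation (exclusive)
    containers: int  # gang size: all containers are needed at the same time


class AdmissionController:
    """Hypothetical admission check: accept a gang only if it always fits."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed fixed total number of containers
        self.accepted = []        # reservations already promised

    def committed_at(self, t):
        # Containers already promised to accepted reservations at time step t.
        return sum(r.containers for r in self.accepted if r.start <= t < r.end)

    def try_admit(self, req):
        # The request fits only if, at every step of its window, existing
        # promises plus the new gang stay within the cluster capacity.
        fits = all(
            self.committed_at(t) + req.containers <= self.capacity
            for t in range(req.start, req.end)
        )
        if fits:
            self.accepted.append(req)
        return fits


ac = AdmissionController(capacity=10)
first = ac.try_admit(Reservation("a", 0, 5, 6))   # fits on an empty plan
second = ac.try_admit(Reservation("b", 3, 8, 6))  # overlaps "a" at steps 3-4
third = ac.try_admit(Reservation("c", 5, 8, 6))   # starts after "a" has ended
```

With a capacity of 10, "a" and "c" are accepted and "b" is rejected, because 
accepting it would promise 12 containers during steps 3 and 4. 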

This means that at run time AM hoarding is OK, because we guarantee that it 
fits. I am aware of at least 2 limitations of this approach w.r.t. the dynamic 
version you were discussing:
* it doesn't work if the application doesn't know its needs until the AM has 
started 
* it can break if we lose large chunks of the cluster (our previously checked 
constraints no longer hold)

Neither seems a great concern, and the second one can be handled with 
re-planning in admission control (which we don't have yet, but it's on our 
agenda).
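
As a rough illustration of what such re-planning might look like (purely an 
assumption, since it does not exist yet): after a capacity drop, re-check the 
accepted reservations in their original acceptance order and report the ones 
that no longer fit. The function name replan, the tuple layout, and the 
first-accepted-first-kept policy are all hypothetical.

```python
def replan(reservations, new_capacity):
    """Re-check accepted reservations against a reduced capacity.

    reservations: list of (app_id, start, end, containers) tuples in
    acceptance order, with `end` exclusive. Returns (kept, displaced).
    """
    kept, displaced = [], []
    for res in reservations:
        _, start, end, containers = res
        # A reservation is kept only if, at every step of its window, the
        # reservations kept so far plus this one fit in the new capacity.
        fits = all(
            sum(c for (_, s, e, c) in kept if s <= t < e) + containers
            <= new_capacity
            for t in range(start, end)
        )
        (kept if fits else displaced).append(res)
    return kept, displaced


# These three reservations fit exactly in a cluster of 10 containers.
accepted = [("a", 0, 5, 6), ("c", 5, 8, 6), ("d", 0, 8, 4)]
kept, displaced = replan(accepted, new_capacity=8)
```

Here "a" and "c" still fit after the cluster shrinks to 8 containers, while 
"d" is displaced and would need to be renegotiated or rejected. 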

                
> Support gang scheduling in the AM RM protocol
> ---------------------------------------------
>
>                 Key: YARN-624
>                 URL: https://issues.apache.org/jira/browse/YARN-624
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, scheduler
>    Affects Versions: 2.0.4-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>
> Per discussion on YARN-392 and elsewhere, gang scheduling, in which a 
> scheduler runs a set of tasks when they can all be run at the same time, 
> would be a useful feature for YARN schedulers to support.
> Currently, AMs can approximate this by holding on to containers until they 
> get all the ones they need.  However, this lends itself to deadlocks when 
> different AMs are waiting on the same containers.

