[ 
https://issues.apache.org/jira/browse/YARN-624?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13658637#comment-13658637
 ] 

Alejandro Abdelnur commented on YARN-624:
-----------------------------------------

As pointed out, supporting gang scheduling at the RM/scheduler level will 
allow detection and avoidance of deadlocks. This would be neither trivial nor 
efficient if gang scheduling is done at the AM level.
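
To make the deadlock argument concrete, here is a minimal toy sketch (not 
YARN code; the function names and the round-robin hand-out policy are 
assumptions made for illustration). It compares AMs that hoard containers 
one at a time with a scheduler that grants a gang only when it fits entirely.

```python
def hoard_at_am_level(free, needs):
    """Hand out containers one at a time, round-robin.

    Each AM keeps what it receives until it has its full set, which is
    the approximation described in the issue. Returns the containers
    held per AM and the list of AMs whose gang is complete.
    """
    held = {am: 0 for am in needs}
    while free > 0:
        progressed = False
        for am in needs:
            if free > 0 and held[am] < needs[am]:
                held[am] += 1
                free -= 1
                progressed = True
        if not progressed:
            break
    satisfied = [am for am in needs if held[am] == needs[am]]
    return held, satisfied


def gang_at_scheduler_level(free, needs):
    """Grant a gang only if all of its containers fit right now.

    A gang that does not fit stays pending and holds nothing, so it
    cannot block another gang. Returns the grants and the free count.
    """
    granted = {}
    for am, count in needs.items():
        if count <= free:
            granted[am] = count
            free -= count
    return granted, free


if __name__ == "__main__":
    needs = {"am1": 3, "am2": 3}  # two AMs, each needing a gang of 3
    print(hoard_at_am_level(4, needs))
    print(gang_at_scheduler_level(4, needs))
```

With 4 free containers and two gangs of 3, hoarding leaves each AM holding 2 
containers with neither able to start, while the all-or-nothing grant lets 
am1 run and keeps am2 pending without holding anything.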

Examples of gang request capabilities could be:

* express a set of containers on any nodes, e.g.: 10 containers on any nodes 
of the cluster.
* express a set of containers on a specified set of nodes, e.g.: 10 containers 
in rack1, or 10 containers, one on each of n1...n10.
* express different sets of possible gangs that would satisfy the request, 
e.g.: 10 containers in rack1 or in rack2; 10 containers on n1...n10 or on 
n11...n20.
* indicate a timeout/fallback-to-normal for gang requests.
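
One possible way to model the capabilities listed above is sketched here. 
This is only an illustration: Gang, GangRequest, fits, choose_gang, the "*" 
wildcard and the field names are all hypothetical and are not part of the 
YARN API. The timeout and fallback fields are carried as data only; the 
sketch does not act on them.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional

ANY = "*"  # hypothetical wildcard meaning "any node in the cluster"


@dataclass
class Gang:
    """One acceptable placement: container counts keyed by location.

    A key is a node name, a rack name, or ANY.
    """
    containers: Dict[str, int]


@dataclass
class GangRequest:
    """A set of alternative gangs, any one of which satisfies the request."""
    alternatives: List[Gang]
    timeout_ms: Optional[int] = None   # give up on the gang after this long
    fallback_to_normal: bool = False   # then fall back to normal requests


def fits(gang, free_by_node, rack_of):
    """Return True if the whole gang can be placed on the free capacity."""
    node_free = dict(free_by_node)
    # Node-level demands can only be served by that exact node.
    for loc, count in gang.containers.items():
        if loc in node_free:
            if node_free[loc] < count:
                return False
            node_free[loc] -= count
    # Whatever is left on the nodes is pooled per rack.
    rack_free = {}
    for node, free in node_free.items():
        rack = rack_of[node]
        rack_free[rack] = rack_free.get(rack, 0) + free
    # Rack-level demands draw on the rack pool; unknown names never fit.
    for loc, count in gang.containers.items():
        if loc == ANY or loc in node_free:
            continue
        if rack_free.get(loc, 0) < count:
            return False
        rack_free[loc] = rack_free.get(loc, 0) - count
    # "Any node" demands draw on everything that remains.
    return gang.containers.get(ANY, 0) <= sum(rack_free.values())


def choose_gang(request, free_by_node, rack_of):
    """Return the first alternative that fits entirely, or None."""
    for gang in request.alternatives:
        if fits(gang, free_by_node, rack_of):
            return gang
    return None


if __name__ == "__main__":
    free = {"n1": 4, "n2": 4, "n3": 6, "n4": 6}
    racks = {"n1": "rack1", "n2": "rack1", "n3": "rack2", "n4": "rack2"}
    # "10 containers in rack1 or in rack2", with a 30 second timeout.
    request = GangRequest(
        alternatives=[Gang({"rack1": 10}), Gang({"rack2": 10})],
        timeout_ms=30000,
        fallback_to_normal=True,
    )
    print(choose_gang(request, free, racks))
```

In this example rack1 has only 8 free containers, so the first alternative 
is skipped and the rack2 gang is chosen. Keeping the alternatives in one 
request is what would let a scheduler pick among them atomically.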

We should decide on what gang capabilities we want/need to address in the short 
term.

                
> Support gang scheduling in the AM RM protocol
> ---------------------------------------------
>
>                 Key: YARN-624
>                 URL: https://issues.apache.org/jira/browse/YARN-624
>             Project: Hadoop YARN
>          Issue Type: Sub-task
>          Components: api, scheduler
>    Affects Versions: 2.0.4-alpha
>            Reporter: Sandy Ryza
>            Assignee: Sandy Ryza
>
> Per discussion on YARN-392 and elsewhere, gang scheduling, in which a 
> scheduler runs a set of tasks when they can all be run at the same time, 
> would be a useful feature for YARN schedulers to support.
> Currently, AMs can approximate this by holding on to containers until they 
> get all the ones they need.  However, this lends itself to deadlocks when 
> different AMs are waiting on the same containers.
