[ 
https://issues.apache.org/jira/browse/YARN-779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13684413#comment-13684413
 ] 

Sandy Ryza commented on YARN-779:
---------------------------------

The MR AM has solved this problem purely on the AM side (can't remember the 
JIRA number, but I'll post it when I find it), so I think it should be possible 
to do this without changing the RM or the AMRM protocol.  The basic issue is 
that, when a container is given to the app, we need to associate it with a 
ContainerRequest so that we can cancel the right resource requests.  In 
general, the AMRMClient cannot automatically perform this association.  
Consider a situation where an app needs two tasks, one on node1 or node2, and 
one on node2 or node3.  When the app receives a container on node2, it will 
assign it to one of these tasks, but only the app knows which task it is 
assigning it to.  So we need some sort of API for the app to communicate this 
knowledge to the AMRMClient.
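
To make the node2 example concrete, here is a minimal sketch of what such an API could look like. Every name in it (AssignmentSketch, Request, Client, candidates, removeRequest) is hypothetical and is not the actual AMRMClient interface; rack-level requests are left out to keep it short. The client can list the requests that a container on node2 could satisfy, but only the app can say which one it picked.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

public class AssignmentSketch {
    /** Hypothetical stand-in for a ContainerRequest: one task that accepts any of the given nodes. */
    static final class Request {
        final String task;
        final Set<String> nodes;

        Request(String task, String... nodes) {
            this.task = task;
            this.nodes = new LinkedHashSet<>(Arrays.asList(nodes));
        }
    }

    /** Hypothetical client-side bookkeeping: outstanding request counts per location. */
    static final class Client {
        final List<Request> pending = new ArrayList<>();
        final Map<String, Integer> outstanding = new TreeMap<>();

        void addRequest(Request r) {
            pending.add(r);
            for (String n : r.nodes) outstanding.merge(n, 1, Integer::sum);
            outstanding.merge("ANY", 1, Integer::sum);
        }

        /** Pending requests that a container on this node could satisfy. */
        List<Request> candidates(String node) {
            List<Request> out = new ArrayList<>();
            for (Request r : pending) if (r.nodes.contains(node)) out.add(r);
            return out;
        }

        /** The app reports which request a container satisfied; cancel all of its locations. */
        void removeRequest(Request r) {
            if (!pending.remove(r)) return;
            for (String n : r.nodes) decrement(n);
            decrement("ANY");
        }

        private void decrement(String location) {
            // Drop the entry entirely once its count would reach zero.
            outstanding.computeIfPresent(location, (k, v) -> v > 1 ? v - 1 : null);
        }
    }

    public static void main(String[] args) {
        Client client = new Client();
        Request taskA = new Request("taskA", "node1", "node2");
        Request taskB = new Request("taskB", "node2", "node3");
        client.addRequest(taskA);
        client.addRequest(taskB);

        // A container arrives on node2: both requests are candidates.
        System.out.println(client.candidates("node2").size()); // prints 2

        // The app decides the container goes to taskB and says so.
        client.removeRequest(taskB);
        System.out.println(client.outstanding); // prints {ANY=1, node1=1, node2=1}
    }
}
```

Had the app picked taskA instead, node3 would still be outstanding and node1 would not, which is why the client cannot make this choice by itself.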
                
> AMRMClient should clean up dangling unsatisfied request
> -------------------------------------------------------
>
>                 Key: YARN-779
>                 URL: https://issues.apache.org/jira/browse/YARN-779
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.0.4-alpha
>            Reporter: Alejandro Abdelnur
>            Assignee: Maysam Yabandeh
>            Priority: Critical
>
> If an AMRMClient places a ContainerRequest for 10 containers on node1 or 
> node2 (assuming a single rack), the resulting ResourceRequests will be:
> {code}
> location - containers
> ---------------------
> node1    - 10
> node2    - 10
> rack     - 10
> ANY      - 10
> {code}
> Assuming 5 containers are allocated in node1 and 5 containers are allocated 
> in node2, the following ResourceRequests will be outstanding on the RM.
> {code}
> location - containers
> ---------------------
> node1    - 5
> node2    - 5
> {code}
> If the AMRMClient does a new ContainerRequest allocation, this time for 5 
> containers in node3, the resulting outstanding ResourceRequests on the RM 
> will be:
> {code}
> location - containers
> ---------------------
> node1    - 5
> node2    - 5
> node3    - 5
> rack     - 5
> ANY      - 5
> {code}
> At this point, the scheduler may assign 5 containers on node1 and never 
> assign the 5 containers requested on node3.
> AMRMClient should keep track of the outstanding allocation count per 
> ContainerRequest and, when it reaches zero, update the RACK/ANY requests, 
> decrementing the dangling requests.
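
The tables in the description can be reproduced with a small simulation. This is a sketch under assumptions, not the real AMRMClient code: the class and method names are made up, and the cleanup step is only one reading of the proposal (once a ContainerRequest's outstanding count reaches zero, subtract whatever it still has left at each node).

```java
import java.util.LinkedHashMap;
import java.util.Map;

public class DanglingRequestSketch {
    /** Hypothetical stand-in for a ContainerRequest: `count` containers on any of `nodes`. */
    static final class Request {
        final Map<String, Integer> perNode = new LinkedHashMap<>();
        int remaining;

        Request(int count, String... nodes) {
            remaining = count;
            for (String n : nodes) perNode.put(n, count);
        }
    }

    /** Outstanding ResourceRequest counts per location, as in the tables above. */
    final Map<String, Integer> table = new LinkedHashMap<>();

    void add(Request r) {
        for (String n : r.perNode.keySet()) table.merge(n, r.remaining, Integer::sum);
        table.merge("rack", r.remaining, Integer::sum);
        table.merge("ANY", r.remaining, Integer::sum);
    }

    /** Record one container allocated on `node` for request `r`. */
    void allocated(Request r, String node, boolean cleanUp) {
        dec(node, 1);
        dec("rack", 1);
        dec("ANY", 1);
        r.perNode.merge(node, -1, Integer::sum);
        r.remaining--;
        if (cleanUp && r.remaining == 0) {
            // Request fully satisfied: remove what it still has outstanding per node.
            for (Map.Entry<String, Integer> e : r.perNode.entrySet()) {
                dec(e.getKey(), e.getValue());
            }
        }
    }

    private void dec(String location, int by) {
        // Drop the entry once its count would reach zero.
        table.computeIfPresent(location, (k, v) -> v - by > 0 ? v - by : null);
    }

    /** Replays the scenario from the description, with or without the cleanup step. */
    static Map<String, Integer> run(boolean cleanUp) {
        DanglingRequestSketch s = new DanglingRequestSketch();
        Request first = new Request(10, "node1", "node2");
        s.add(first);
        for (int i = 0; i < 5; i++) s.allocated(first, "node1", cleanUp);
        for (int i = 0; i < 5; i++) s.allocated(first, "node2", cleanUp);
        s.add(new Request(5, "node3"));
        return s.table;
    }

    public static void main(String[] args) {
        System.out.println(run(false)); // prints {node1=5, node2=5, node3=5, rack=5, ANY=5}
        System.out.println(run(true));  // prints {node3=5, rack=5, ANY=5}
    }
}
```

Without the cleanup the stale node1 and node2 entries survive and can absorb the rack/ANY count meant for node3; with it, only the node3 request remains outstanding.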

