[ https://issues.apache.org/jira/browse/YARN-110?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15057843#comment-15057843 ]

Arun Suresh commented on YARN-110:
----------------------------------

Correct me if I am wrong, [~giovanni.fumarola].
Assume the situation mentioned in the Description.
Currently, as per MAPREDUCE-4671, once an AM has received all its containers, 
the MR AM will just send an ask with a container count of 0 to cancel any 
outstanding requests. That implies the RM has to wait for the next allocate 
call from the AM to free up any erroneously granted resources (the 7 extra 
containers in the example), which means that any other AM running on a fully 
utilized cluster will have to wait an allocate heartbeat, or two, before its 
own resource ask can be satisfied.
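
For concreteness, a minimal sketch of what that zero-count cancel ask looks 
like at the protocol-record level. It only assumes the public record 
factories under {{org.apache.hadoop.yarn.api}}; the method name 
{{cancelOutstandingAsk}} and the priority/capability values are illustrative, 
not the actual MR AM code.

{code:java}
import java.util.Collections;
import java.util.List;

import org.apache.hadoop.yarn.api.protocolrecords.AllocateRequest;
import org.apache.hadoop.yarn.api.records.ContainerId;
import org.apache.hadoop.yarn.api.records.Priority;
import org.apache.hadoop.yarn.api.records.Resource;
import org.apache.hadoop.yarn.api.records.ResourceRequest;

public class CancelAskSketch {

  /**
   * Builds an allocate request whose only ask carries numContainers = 0
   * for the given host, i.e. "I need nothing more here", which clears the
   * outstanding count the RM is still tracking at that priority/location.
   */
  static AllocateRequest cancelOutstandingAsk(int responseId, String host) {
    ResourceRequest cancel = ResourceRequest.newInstance(
        Priority.newInstance(0),         // same priority as the original ask
        host,                            // e.g. "H1" (or ResourceRequest.ANY)
        Resource.newInstance(1024, 1),   // same capability as the original ask
        0);                              // zero containers: cancel, do not request
    List<ResourceRequest> asks = Collections.singletonList(cancel);
    return AllocateRequest.newInstance(responseId, 0.5f, asks,
        Collections.<ContainerId>emptyList(), null);
  }
}
{code}

The RM only sees this cancellation on the AM's next allocate heartbeat, which 
is exactly the window described above; a real cancellation would also zero 
the matching rack-level and ANY-level asks.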

> AM releases too many containers due to the protocol
> ---------------------------------------------------
>
>                 Key: YARN-110
>                 URL: https://issues.apache.org/jira/browse/YARN-110
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: resourcemanager, scheduler
>            Reporter: Arun C Murthy
>            Assignee: Arun C Murthy
>         Attachments: YARN-110.patch
>
>
> - The AM sends a request asking for 4 containers on host H1.
> - Asynchronously, host H1 heartbeats to the RM and gets assigned the 4 
> containers. At this point the RM sets the value against H1 to zero in its 
> aggregate request table for all apps.
> - In the meanwhile the AM comes to need 3 more containers, so a total of 7 
> including the 4 from the previous request.
> - Today, the AM sends the absolute number 7 against H1 to the RM as part of 
> its request table.
> - The RM ends up overriding its earlier value of zero against H1 with 7, 
> and thus allocating 7 more containers.
> - The AM already got 4 in this scheduling iteration, but gets 7 more: a 
> total of 11 instead of the required 7.
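
To make the quoted sequence easy to replay, here is a toy, self-contained 
simulation of the race; the request-table shape and all names 
({{rmRequestTable}}, {{amSendsAsk}}, {{nodeHeartbeat}}) are purely 
illustrative, not the real RM/scheduler code. Because the ask is an absolute 
count and the RM overwrites its already-decremented entry with it, the AM 
ends up with 11 containers instead of 7.

{code:java}
import java.util.HashMap;
import java.util.Map;

/**
 * Toy model of the sequence above: the AM's ask is an absolute count per
 * host, and the RM overwrites its (already decremented) request-table entry
 * with it. Names are illustrative; this is not the real scheduler code.
 */
public class OverAllocationDemo {

  // RM's aggregate outstanding-ask count per host.
  static final Map<String, Integer> rmRequestTable = new HashMap<String, Integer>();
  static int containersGranted = 0;

  // AM heartbeat: the ask is the absolute number of containers wanted on the host.
  static void amSendsAsk(String host, int absoluteCount) {
    rmRequestTable.put(host, absoluteCount);   // blind overwrite, no reconciliation
  }

  // Node heartbeat: the RM hands out whatever the table says and zeroes the entry.
  static void nodeHeartbeat(String host) {
    Integer pending = rmRequestTable.get(host);
    if (pending != null && pending > 0) {
      containersGranted += pending;
      rmRequestTable.put(host, 0);
    }
  }

  public static void main(String[] args) {
    amSendsAsk("H1", 4);   // AM asks for 4 containers on H1
    nodeHeartbeat("H1");   // H1 heartbeats first: RM grants 4 and zeroes the entry
    amSendsAsk("H1", 7);   // AM now needs 7 total, unaware the 4 were just granted
    nodeHeartbeat("H1");   // RM grants 7 more
    System.out.println("granted = " + containersGranted + ", wanted 7");
    // prints: granted = 11, wanted 7
  }
}
{code}

Running {{main}} prints {{granted = 11, wanted 7}}, matching the last step of 
the description.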


