[ https://issues.apache.org/jira/browse/YARN-110?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Arun C Murthy updated YARN-110:
-------------------------------
Attachment: YARN-110.patch
Simple patch to reconcile the AM's view of the world with the RM's within the
'transaction', i.e. the allocate call.
The essential idea is for the RM to account for the containers newly allocated
since the last AM heartbeat when it updates #containers for * (ANY).
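
As a rough illustration of that idea (a sketch only, not the contents of
YARN-110.patch), the RM could discount the containers it has handed out since
the AM's last heartbeat before accepting the AM's absolute ask. The class,
field, and method names below are hypothetical and do not correspond to actual
YARN classes.

```python
from dataclasses import dataclass

ANY = "*"  # resource name meaning "any host"


@dataclass
class AppAnyRequest:
    """Hypothetical RM-side bookkeeping for one app's * (ANY) request."""

    outstanding: int = 0  # containers the RM still intends to allocate
    unreported: int = 0   # containers allocated since the last AM heartbeat

    def on_container_allocated(self) -> None:
        # The scheduler assigned one container; the AM has not seen it yet.
        self.outstanding = max(0, self.outstanding - 1)
        self.unreported += 1

    def on_allocate_call(self, absolute_ask: int) -> int:
        """Handle one AM heartbeat; return how many containers are reported."""
        # The AM computed its ask before learning about the unreported
        # containers, so subtract them instead of overwriting blindly.
        self.outstanding = max(0, absolute_ask - self.unreported)
        reported = self.unreported
        self.unreported = 0
        return reported
```

With an initial ask of 4, four allocations, and a second ask of 7, this sketch
leaves 3 containers outstanding rather than 7, so the AM ends up with 7 in
total.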
> AM releases too many containers due to the protocol
> ---------------------------------------------------
>
> Key: YARN-110
> URL: https://issues.apache.org/jira/browse/YARN-110
> Project: Hadoop YARN
> Issue Type: Bug
> Components: resourcemanager, scheduler
> Reporter: Arun C Murthy
> Assignee: Arun C Murthy
> Attachments: YARN-110.patch
>
>
> - The AM sends a request asking for 4 containers on host H1.
> - Asynchronously, host H1 reports to the RM and is assigned 4 containers. At
> this point the RM sets the value against H1 to zero in its aggregate
> request table for all apps.
> - Meanwhile, the AM comes to need 3 more containers, for a total of 7
> including the 4 from the previous request.
> - Today, the AM sends the absolute number, 7, against H1 to the RM as part of
> its request table.
> - The RM appears to overwrite its earlier value of zero against H1 with 7, and
> thus allocates 7 more containers.
> - The AM has already received 4 in this scheduling iteration but gets 7 more,
> a total of 11 instead of the required 7.
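
The arithmetic in the steps above can be traced with a small simulation. This
is only a sketch of the sequence as described in the report; the function and
variable names are made up for illustration and nothing here is YARN code.

```python
def trace_over_allocation(first_ask: int, extra_needed: int) -> tuple:
    """Replay the reported sequence; return (containers granted, containers needed)."""
    granted = 0

    # Heartbeat 1: the AM asks for `first_ask` containers on H1.
    rm_table_h1 = first_ask

    # H1 reports in; the RM assigns everything and zeroes its entry.
    granted += rm_table_h1
    rm_table_h1 = 0

    # Heartbeat 2: the AM has not seen those containers yet, so it sends
    # the absolute total it believes it still needs.
    absolute_ask = first_ask + extra_needed

    # The RM overwrites its zero with the absolute number.
    rm_table_h1 = absolute_ask

    # The RM then satisfies the overwritten entry in full.
    granted += rm_table_h1
    rm_table_h1 = 0

    return granted, first_ask + extra_needed
```

Calling `trace_over_allocation(4, 3)` returns `(11, 7)`: 11 containers granted
against 7 needed, which leaves the AM with 4 surplus containers to release.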