[
https://issues.apache.org/jira/browse/YARN-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284451#comment-14284451
]
Wangda Tan commented on YARN-3020:
----------------------------------
[~peterdkirchner],
The expected usage of AMRMClient is as follows (thanks to [~hitesh] and
[~jianhe] for their input): when you receive newly allocated containers from the
RM, you should call {{removeContainerRequest}} yourself to remove the matching
pending container requests. AMRMClient will not automatically decrement the
number of pending container requests.
The reason is that when a container is allocated by the RM, AMRMClient doesn't
know which ResourceRequest it was allocated for. You might think that because a
container has a priority, capability and resourceName, AMRMClient could find the
ResourceRequest via {{getMatchingRequests}}. But some applications may use the
container for other purposes (AMRMClient cannot understand an application's
specific logic). So the AM should call {{removeContainerRequest}} itself.
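As a rough illustration of that bookkeeping, here is a minimal, self-contained sketch. It does not use the Hadoop classes: {{PendingRequestBook}}, {{Key}} and {{onContainerAllocated}} are hypothetical names invented for this example, and memory in MB stands in for the full resource. In a real AM the corresponding calls would be {{getMatchingRequests}} and {{removeContainerRequest}} on AMRMClient.

```java
import java.util.ArrayList;
import java.util.Iterator;
import java.util.List;

/** Toy AM-side bookkeeping. Hypothetical code, not part of Hadoop. */
final class PendingRequestBook {

    /** Stand-in for the attributes shared by a request and an allocated container. */
    static final class Key {
        final int priority;
        final String resourceName;
        final int memoryMb;

        Key(int priority, String resourceName, int memoryMb) {
            this.priority = priority;
            this.resourceName = resourceName;
            this.memoryMb = memoryMb;
        }

        boolean matches(Key other) {
            return priority == other.priority
                    && resourceName.equals(other.resourceName)
                    && memoryMb == other.memoryMb;
        }
    }

    private final List<Key> pending = new ArrayList<>();

    void addContainerRequest(Key request) {
        pending.add(request);
    }

    /**
     * Called by the application for each allocated container it decides to
     * count against a pending request. Removes at most one matching request
     * and reports whether one was found.
     */
    boolean onContainerAllocated(Key allocated) {
        Iterator<Key> it = pending.iterator();
        while (it.hasNext()) {
            if (it.next().matches(allocated)) {
                it.remove();
                return true;
            }
        }
        return false; // no match: pending requests are left untouched
    }

    int pendingCount() {
        return pending.size();
    }

    public static void main(String[] args) {
        PendingRequestBook book = new PendingRequestBook();
        book.addContainerRequest(new Key(1, "*", 1024));
        book.addContainerRequest(new Key(1, "*", 1024));
        book.onContainerAllocated(new Key(1, "*", 1024));
        System.out.println(book.pendingCount()); // prints 1
    }
}
```

The removal is triggered by the application rather than by the client, so an application that uses a container for some other purpose can simply skip the call.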
To improve this, I think 1) we need to add this behavior to the YARN docs so that
people better understand how to use AMRMClient, and 2) maybe we should add a
default implementation that automatically deducts pending resource requests by
the priority/resource-name/capability of allocated containers (users could
disable this default behavior and implement their own logic to deduct pending
resource requests).
Does this make sense to you?
Thanks,
Wangda
> n similar addContainerRequest()s produce n*(n+1)/2 containers
> -------------------------------------------------------------
>
> Key: YARN-3020
> URL: https://issues.apache.org/jira/browse/YARN-3020
> Project: Hadoop YARN
> Issue Type: Bug
> Components: client
> Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
> Reporter: Peter D Kirchner
> Original Estimate: 24h
> Remaining Estimate: 24h
>
> BUG: If the application master calls addContainerRequest() n times, but with
> the same priority, I get up to 1+2+3+...+n = n*(n+1)/2 containers. The most
> containers are requested when the interval between calls to
> addContainerRequest() exceeds the heartbeat interval of calls to allocate()
> (in AMRMClientImpl's run() method).
> If the application master calls addContainerRequest() n times, but with a
> unique priority each time, I get n containers (as I intended).
> Analysis:
> There is a logic problem in AMRMClientImpl.java.
> Although allocate() in AMRMClientImpl.java does an ask.clear(), on subsequent
> calls to addContainerRequest(), addResourceRequest() finds the previous
> matching remoteRequest and increments the container count rather than
> starting anew, and does an addResourceRequestToAsk() which defeats the
> ask.clear().
> From documentation and code comments, it was hard for me to discern the
> intended behavior of the API, but the inconsistency reported in this issue
> suggests one case or the other is implemented incorrectly.
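
The arithmetic in the report can be reproduced with a toy model. This is only a sketch under stated assumptions, not the AMRMClientImpl code: {{PendingCountModel}} and {{simulate}} are hypothetical names, there is a single (priority, resourceName, capability) combination, every heartbeat falls between two addContainerRequest() calls, and the toy RM treats each ask as an absolute count and grants all of it immediately.

```java
/** Toy model of the client-side container count. Hypothetical code, not Hadoop. */
final class PendingCountModel {
    private int pendingCount = 0;  // count kept by the client for one request key
    private boolean dirty = false; // true when the count must be re-sent

    void addContainerRequest() {
        pendingCount++;
        dirty = true;
    }

    void removeContainerRequest() {
        if (pendingCount > 0) {
            pendingCount--;
            dirty = true;
        }
    }

    /** One heartbeat: re-send the count if it changed; the toy RM grants all of it. */
    int allocate() {
        if (!dirty) {
            return 0;
        }
        dirty = false;
        return pendingCount;
    }

    /**
     * Adds n requests with a heartbeat after each one and returns the total
     * number of containers granted. When removeOnAllocation is true, the
     * application removes one pending request per granted container.
     */
    static int simulate(int n, boolean removeOnAllocation) {
        PendingCountModel client = new PendingCountModel();
        int total = 0;
        for (int i = 0; i < n; i++) {
            client.addContainerRequest();
            int granted = client.allocate();
            total += granted;
            if (removeOnAllocation) {
                for (int c = 0; c < granted; c++) {
                    client.removeContainerRequest();
                }
            }
        }
        return total;
    }

    public static void main(String[] args) {
        System.out.println(simulate(4, false)); // prints 10 = 4*(4+1)/2
        System.out.println(simulate(4, true));  // prints 4
    }
}
```

Under these assumptions the count grows as 1, 2, ..., n when nothing is removed, which gives n*(n+1)/2 containers in total, while removing one request per granted container keeps each ask at 1 and yields n.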
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)