[ 
https://issues.apache.org/jira/browse/YARN-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14271971#comment-14271971
 ] 

Peter D Kirchner commented on YARN-3020:
----------------------------------------

Hold the phone.  Your test in TestAMRMClient.java will ~always succeed but does 
not test for the bug.  Your test counts the *client's* accumulation of resource 
requests, and does not test the *number of times* the requests, in various 
stages of accumulation, gets sent to the server (Resource Manager).  Please 
call addContainerRequest() multiple times across multiple heartbeats, and then 
examine the number of containers that the Resource Manager allocates, and you 
should see the bug.

> n similar addContainerRequest()s produce n*(n+1)/2 containers
> -------------------------------------------------------------
>
>                 Key: YARN-3020
>                 URL: https://issues.apache.org/jira/browse/YARN-3020
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
>            Reporter: Peter D Kirchner
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> BUG: If the application master calls addContainerRequest() n times, but with 
> the same priority, I get up to 1+2+3+...+n containers = n*(n+1)/2 .  The most 
> containers are requested when the interval between calls to 
> addContainerRequest() exceeds the heartbeat interval of calls to allocate() 
> (in AMRMClientImpl's run() method).
> If the application master calls addContainerRequest() n times, but with a 
> unique priority each time, I get n containers (as I intended).
> Analysis:
> There is a logic problem in AMRMClientImpl.java.
> Although AMRMClientImpl.java, allocate() does an ask.clear() , on subsequent 
> calls to addContainerRequest(), addResourceRequest() finds the previous 
> matching remoteRequest and increments the container count rather than 
> starting anew, and does an addResourceRequestToAsk() which defeats the 
> ask.clear().
> From documentation and code comments, it was hard for me to discern the 
> intended behavior of the API, but the inconsistency reported in this issue 
> suggests one case or the other is implemented incorrectly.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)

Reply via email to