[ 
https://issues.apache.org/jira/browse/YARN-3020?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14284301#comment-14284301
 ] 

Peter D Kirchner commented on YARN-3020:
----------------------------------------

https://issues.apache.org/jira/secure/ViewProfile.jspa?name=ywskycn : Please 
take a look at this snippet, which modifies distributedShell, and at its 
output; perhaps it will make my point clearer.  The accounting behind what gets 
sent to the RM on the heartbeats that follow either addContainerRequest() or 
removeContainerRequest() is defective.  The code ostensibly requests only 10 
containers, yet 100 are assigned.  It makes 10 adds with interleaved 
heartbeats, followed by 10 removes, also with interleaved heartbeats, that 
should be no-ops.  The adds produce 55 containers (1+2+3+4+5+6+7+8+9+10), and 
the 10 calls to remove request 45 more (9+8+7+6+5+4+3+2+1). 

// Iterations 0-9 add a request; iterations 10-19 remove one.
for (int i = 0; i < 20; i++) {
  try {
    ContainerRequest containerAsk = setupContainerAskForRM();

    if (i < 10) {
      amRMClient.addContainerRequest(containerAsk);
    } else {
      amRMClient.removeContainerRequest(containerAsk);
    }
    // Wait so that a heartbeat to the RM is interleaved between calls.
    Thread.sleep(1500);
    // Report how many requests the client still considers outstanding.
    List<?> list1 = amRMClient.getMatchingRequests(
        containerAsk.getPriority(), "*", containerAsk.getCapability());
    Collection<?> set1 = (Collection<?>) list1.get(0);
    System.out.println("i=" + i + " outstanding=" + set1.size());
  } catch (InterruptedException e1) {
    e1.printStackTrace();
  }
}

DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=1 #asks=1
i=0 outstanding=1
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=2 #asks=1
i=1 outstanding=2
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=3 #asks=1
i=2 outstanding=3
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=4 #asks=1
i=3 outstanding=4
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=5 #asks=1
i=4 outstanding=5
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=6 #asks=1
i=5 outstanding=6
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=7 #asks=1
i=6 outstanding=7
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=8 #asks=1
i=7 outstanding=8
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=9 #asks=1
i=8 outstanding=9
DEBUG [Thread-7] (AMRMClientImpl.java:585) - addResourceRequest: applicationId= priority=0 resourceName=* numContainers=10 #asks=1
i=9 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=10 #asks=0
 INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=9 #asks=1
i=10 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=9 #asks=0
 INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=8 #asks=1
i=11 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=8 #asks=0
 INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=7 #asks=1
i=12 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=7 #asks=0
 INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=6 #asks=1
i=13 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=6 #asks=0
 INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=5 #asks=1
i=14 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=5 #asks=0
 INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=4 #asks=1
i=15 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=4 #asks=0
 INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=3 #asks=1
i=16 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=3 #asks=0
 INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=2 #asks=1
i=17 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=2 #asks=0
 INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=1 #asks=1
i=18 outstanding=10
DEBUG [Thread-7] (AMRMClientImpl.java:619) - BEFORE decResourceRequest: applicationId= priority=0 resourceName=* numContainers=1 #asks=0
 INFO [Thread-7] (AMRMClientImpl.java:652) - AFTER decResourceRequest: applicationId= priority=0 resourceName=* numContainers=0 #asks=1

[hadoop-2.5.0]$ find logs/userlogs -name stdout | wc -l
100
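
The total of 100 agrees with the numContainers values in the log if each heartbeat's ask is assumed to be granted in full. The following is only an arithmetic check of that reading, not Hadoop code; the class and method names are made up for illustration.

```java
import java.util.stream.IntStream;

// Hypothetical helper: sums the numContainers values seen in the log above.
public class ContainerCountCheck {
    /** Counts sent after each of n adds: 1 + 2 + ... + n. */
    static int fromAdds(int n) {
        return IntStream.rangeClosed(1, n).sum();
    }

    /** Counts sent after each of n removes: (n-1) + (n-2) + ... + 0. */
    static int fromRemoves(int n) {
        return IntStream.range(0, n).sum();
    }

    public static void main(String[] args) {
        int n = 10;
        int adds = fromAdds(n);
        int removes = fromRemoves(n);
        // Prints: adds=55 removes=45 total=100
        System.out.println("adds=" + adds + " removes=" + removes
            + " total=" + (adds + removes));
    }
}
```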


> n similar addContainerRequest()s produce n*(n+1)/2 containers
> -------------------------------------------------------------
>
>                 Key: YARN-3020
>                 URL: https://issues.apache.org/jira/browse/YARN-3020
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: client
>    Affects Versions: 2.5.0, 2.6.0, 2.5.1, 2.5.2
>            Reporter: Peter D Kirchner
>   Original Estimate: 24h
>  Remaining Estimate: 24h
>
> BUG: If the application master calls addContainerRequest() n times with the 
> same priority, I get up to 1+2+3+...+n = n*(n+1)/2 containers.  The most 
> containers are requested when the interval between calls to 
> addContainerRequest() exceeds the heartbeat interval of calls to allocate() 
> (in AMRMClientImpl's run() method).
> If the application master calls addContainerRequest() n times with a 
> unique priority each time, I get n containers (as I intended).
> Analysis:
> There is a logic problem in AMRMClientImpl.java.
> Although allocate() in AMRMClientImpl.java does an ask.clear(), on subsequent 
> calls to addContainerRequest(), addResourceRequest() finds the previous 
> matching remoteRequest and increments its container count rather than 
> starting anew, then does an addResourceRequestToAsk(), which defeats the 
> ask.clear().
> From the documentation and code comments, it was hard for me to discern the 
> intended behavior of the API, but the inconsistency reported in this issue 
> suggests that one case or the other is implemented incorrectly.
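
The analysis in the description can be illustrated with a toy model. This is a simplified sketch and not the AMRMClientImpl source: the class, its fields and the int priority key are hypothetical, and it assumes the RM grants every count it receives on a heartbeat. It is only meant to show how a cumulative count that is re-added to the ask after each clear() yields n*(n+1)/2, while unique priorities yield n.

```java
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical model of the accounting described in the issue.
public class AskAccountingModel {
    // Cumulative container count per priority (stands in for the remote request table).
    private final Map<Integer, Integer> remoteCounts = new HashMap<>();
    // Priorities whose count goes out on the next heartbeat (stands in for "ask").
    private final Set<Integer> ask = new LinkedHashSet<>();
    private int granted = 0;

    void addContainerRequest(int priority) {
        // Increments the existing count instead of starting anew...
        remoteCounts.merge(priority, 1, Integer::sum);
        // ...and puts the entry back into the ask, undoing the earlier clear().
        ask.add(priority);
    }

    void heartbeat() {
        // Assumption: the RM grants the full count carried by each ask entry.
        for (int priority : ask) {
            granted += remoteCounts.get(priority);
        }
        ask.clear();
    }

    /** Makes n adds, each followed by a heartbeat, and returns the containers granted. */
    static int run(int n, boolean uniquePriorities) {
        AskAccountingModel model = new AskAccountingModel();
        for (int i = 0; i < n; i++) {
            model.addContainerRequest(uniquePriorities ? i : 0);
            model.heartbeat();
        }
        return model.granted;
    }

    public static void main(String[] args) {
        System.out.println("same priority:   " + run(10, false)); // 55
        System.out.println("unique priority: " + run(10, true));  // 10
    }
}
```

In this model the same-priority case sends 1, 2, ..., n on successive heartbeats, whereas each unique priority only ever carries a count of 1.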



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
