[ https://issues.apache.org/jira/browse/YARN-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Eric Badger updated YARN-5994:
------------------------------
    Attachment: YARN-5994.001.patch

It turns out the problem was on the scheduler side. {{rm.submitApp()}} was 
being called before the scheduler's max capacity had been updated. In most 
normal cases (i.e. not on a slow machine), the max was still set to the 
default (8192) when the check for an over-capacity request ran, so the request 
passed. When the main thread was slowed for whatever reason, the max capacity 
had time to update to 2048, which is what we set in {{rm.registerNode()}} early 
in the test. Then when {{rm.submitApp()}} was called, it would see that the 
request of 3072 > 2048 and fail. Putting a {{Thread.sleep(10000)}} anywhere 
before {{rm.submitApp()}} reproduces the failure.

I'm putting up a patch that increases the node memory to 4 GB and adds a 
{{waitFor()}} so the test waits for the max capacity to be updated.
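
For illustration only, here is a minimal, self-contained sketch of the race and of the wait-then-submit idea. It is not the code from the patch: the class {{MaxCapacityWaitSketch}}, the methods {{accepts()}} and {{submitAfterWait()}}, the 50 ms delay, and the local {{waitFor()}} helper are all hypothetical stand-ins, and the scheduler's max is modeled as a plain integer that a background thread lowers to the node size.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

public class MaxCapacityWaitSketch {
    // Assumed default max allocation (MB) before any node has registered.
    public static final int DEFAULT_MAX_MB = 8192;

    /** Polls the condition every intervalMs; throws if it is still false after timeoutMs. */
    public static void waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs)
            throws InterruptedException, TimeoutException {
        long deadline = System.currentTimeMillis() + timeoutMs;
        while (!condition.getAsBoolean()) {
            if (System.currentTimeMillis() > deadline) {
                throw new TimeoutException("condition not met within " + timeoutMs + " ms");
            }
            Thread.sleep(intervalMs);
        }
    }

    /** Stand-in for the over-capacity check: the request must fit under the current max. */
    public static boolean accepts(int requestMb, int maxMb) {
        return requestMb <= maxMb;
    }

    /**
     * Simulates node registration lowering the max asynchronously, waits for
     * that update to land, and only then runs the capacity check.
     */
    public static boolean submitAfterWait(int nodeMb, int requestMb) throws Exception {
        AtomicInteger maxMb = new AtomicInteger(DEFAULT_MAX_MB);
        Thread updater = new Thread(() -> {
            try {
                Thread.sleep(50); // hypothetical delay before the update arrives
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            maxMb.set(nodeMb);
        });
        updater.start();

        // Without this wait, the check below could see either 8192 or nodeMb.
        waitFor(() -> maxMb.get() == nodeMb, 10, 5000);
        updater.join();
        return accepts(requestMb, maxMb.get());
    }

    public static void main(String[] args) throws Exception {
        System.out.println(submitAfterWait(2048, 3072)); // false: 3072 > 2048
        System.out.println(submitAfterWait(4096, 3072)); // true: 3072 <= 4096
    }
}
```

With the wait in place the result no longer depends on thread timing: a 2048 MB node always rejects the 3072 MB request, and a 4096 MB node always accepts it.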

> TestCapacityScheduler.testAMLimitUsage fails intermittently
> -----------------------------------------------------------
>
>                 Key: YARN-5994
>                 URL: https://issues.apache.org/jira/browse/YARN-5994
>             Project: Hadoop YARN
>          Issue Type: Bug
>            Reporter: Eric Badger
>            Assignee: Eric Badger
>         Attachments: 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler-output.txt,
>  YARN-5994.001.patch
>
>
> {noformat}
> java.lang.AssertionError: app shouldn't be null
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.junit.Assert.assertTrue(Assert.java:41)
>       at org.junit.Assert.assertNotNull(Assert.java:621)
>       at org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:169)
>       at org.apache.hadoop.yarn.server.resourcemanager.MockRM.submitApp(MockRM.java:577)
>       at org.apache.hadoop.yarn.server.resourcemanager.MockRM.submitApp(MockRM.java:488)
>       at org.apache.hadoop.yarn.server.resourcemanager.MockRM.submitApp(MockRM.java:395)
>       at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler.verifyAMLimitForLeafQueue(TestCapacityScheduler.java:3389)
>       at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler.testAMLimitUsage(TestCapacityScheduler.java:3251)
> {noformat}


