[ https://issues.apache.org/jira/browse/YARN-5994?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Eric Badger updated YARN-5994:
------------------------------
Attachment: YARN-5994.001.patch
It turns out the problem was on the scheduler side. {{rm.submitApp()}} was
being called before the scheduler's max capacity had been updated. In most
normal cases (i.e. not on a slow machine) the max was still at the default
(8192) when the scheduler checked whether the request was over capacity, so
the request passed. When the main thread was slowed for whatever reason, the
max capacity had time to update to 2048, which is what we set in
{{rm.registerNode()}} early in the test. {{rm.submitApp()}} then saw that the
request of 3072 > 2048 and failed. Putting a {{Thread.sleep(10000)}} anywhere
before {{rm.submitApp()}} makes the test fail in this way every time.
I'm putting up a patch that increases the node memory to 4GB, so that the
3072 request fits, and adds a {{waitFor()}} that waits for the max capacity to
be updated before the app is submitted.
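
For illustration only, here is a minimal self-contained sketch of the race and of the wait-then-submit pattern. It is not the patch itself: {{FakeScheduler}}, {{submitAfterNodeUpdate()}} and the local {{waitFor()}} helper are hypothetical stand-ins written for this example, not MockRM or Hadoop test-utility APIs, and the initial maximum of 8192 is an assumption about the default.

```java
import java.util.concurrent.TimeoutException;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

class MaxCapacityWaitSketch {

  /** Hypothetical stand-in for the scheduler's max-capacity bookkeeping. */
  static final class FakeScheduler {
    private final AtomicInteger maxMemoryMb;

    FakeScheduler(int initialMaxMb) {
      this.maxMemoryMb = new AtomicInteger(initialMaxMb);
    }

    int getMaxMemoryMb() {
      return maxMemoryMb.get();
    }

    void setMaxMemoryMb(int mb) {
      maxMemoryMb.set(mb);
    }

    /** A request is accepted only if it fits under the current max. */
    boolean accepts(int requestMb) {
      return requestMb <= maxMemoryMb.get();
    }
  }

  /** Polls the condition until it holds, or throws once the timeout expires. */
  static void waitFor(BooleanSupplier check, long intervalMs, long timeoutMs)
      throws InterruptedException, TimeoutException {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (!check.getAsBoolean()) {
      if (System.nanoTime() - deadline >= 0) {
        throw new TimeoutException("condition not met within " + timeoutMs + " ms");
      }
      Thread.sleep(intervalMs);
    }
  }

  /**
   * Simulates a node registration whose effect on the max capacity arrives
   * asynchronously, waits for that update, and only then checks the request.
   */
  static boolean submitAfterNodeUpdate(int nodeMb, int requestMb) throws Exception {
    FakeScheduler scheduler = new FakeScheduler(8192); // assumed default max
    Thread updater = new Thread(() -> {
      try {
        Thread.sleep(50); // the update lands some time after registration
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      scheduler.setMaxMemoryMb(nodeMb);
    });
    updater.start();

    // Without this wait, the result depends on which thread runs first.
    waitFor(() -> scheduler.getMaxMemoryMb() == nodeMb, 10, 5000);
    updater.join();
    return scheduler.accepts(requestMb);
  }

  public static void main(String[] args) throws Exception {
    System.out.println("2048 node, 3072 request: " + submitAfterNodeUpdate(2048, 3072));
    System.out.println("4096 node, 3072 request: " + submitAfterNodeUpdate(4096, 3072));
  }
}
```

With the wait in place the outcome no longer depends on timing: a 2048 node always rejects the 3072 request, and a 4096 node always accepts it, which is why the sketch needs both the wait and the larger node.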
> TestCapacityScheduler.testAMLimitUsage fails intermittently
> -----------------------------------------------------------
>
> Key: YARN-5994
> URL: https://issues.apache.org/jira/browse/YARN-5994
> Project: Hadoop YARN
> Issue Type: Bug
> Reporter: Eric Badger
> Assignee: Eric Badger
> Attachments: org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler-output.txt, YARN-5994.001.patch
>
>
> {noformat}
> java.lang.AssertionError: app shouldn't be null
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.assertTrue(Assert.java:41)
> at org.junit.Assert.assertNotNull(Assert.java:621)
> at org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:169)
> at org.apache.hadoop.yarn.server.resourcemanager.MockRM.submitApp(MockRM.java:577)
> at org.apache.hadoop.yarn.server.resourcemanager.MockRM.submitApp(MockRM.java:488)
> at org.apache.hadoop.yarn.server.resourcemanager.MockRM.submitApp(MockRM.java:395)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler.verifyAMLimitForLeafQueue(TestCapacityScheduler.java:3389)
> at org.apache.hadoop.yarn.server.resourcemanager.scheduler.capacity.TestCapacityScheduler.testAMLimitUsage(TestCapacityScheduler.java:3251)
> {noformat}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)