[
https://issues.apache.org/jira/browse/YARN-4989?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15253675#comment-15253675
]
Rohith Sharma K S commented on YARN-4989:
-----------------------------------------
In test {{TestWorkPreservingRMRestart#testCapacitySchedulerRecovery}}, after the
RM has recovered and the NMs have registered, the test waits for each app to
recover its containers. In the test code, there are 3 apps running before RM
restart. After RM restart, the {{waitForNumContainersToRecover}} method is
called for only 2 of those apps.
{code}
// Wait for RM to settle down on recovering containers;
waitForNumContainersToRecover(2, rm2, am1_1.getApplicationAttemptId());
waitForNumContainersToRecover(2, rm2, am1_2.getApplicationAttemptId());
waitForNumContainersToRecover(2, rm2, am1_2.getApplicationAttemptId());
{code}
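In the above code, the third {{waitForNumContainersToRecover}} call should wait
on the third app. Instead it repeats the 2nd app's attempt ({{am1_2}}), so the
test can move on to its assertions before the third app's containers have been
recovered.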
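One way to guard against this kind of copy-paste slip is to drive the waits from the list of attempts that were running before the restart, rather than writing one call per app by hand. The following is only a minimal, self-contained sketch of that idea: the class {{RecoveryWaitSketch}}, the method {{attemptsNeverWaitedOn}}, and the placeholder name {{am_third}} are hypothetical and are not part of the Hadoop test code. Plain strings stand in for the application attempt IDs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

public class RecoveryWaitSketch {

  /**
   * Returns the attempts that were running before the restart but never
   * appear in the list of attempts the test waited on.
   * Hypothetical helper, for illustration only.
   */
  public static List<String> attemptsNeverWaitedOn(List<String> running,
                                                   List<String> waitedOn) {
    Set<String> missing = new LinkedHashSet<>(running);
    missing.removeAll(waitedOn);
    return new ArrayList<>(missing);
  }

  public static void main(String[] args) {
    // "am_third" is a placeholder for the third app's AM handle, whose
    // real variable name is not shown in the snippet above.
    List<String> running = Arrays.asList("am1_1", "am1_2", "am_third");

    // The waits as written in the test: am1_2 appears twice.
    List<String> asWritten = Arrays.asList("am1_1", "am1_2", "am1_2");
    System.out.println(attemptsNeverWaitedOn(running, asWritten)); // prints [am_third]

    // Waiting once per entry of the running list leaves nothing out.
    System.out.println(attemptsNeverWaitedOn(running, running)); // prints []
  }
}
```

In the actual test, the same effect would come from looping over the attempts and calling {{waitForNumContainersToRecover}} once per entry, or simply from pointing the third call at the third app's attempt.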
> TestWorkPreservingRMRestart#testCapacitySchedulerRecovery fails intermittently
> -------------------------------------------------------------------------------
>
> Key: YARN-4989
> URL: https://issues.apache.org/jira/browse/YARN-4989
> Project: Hadoop YARN
> Issue Type: Bug
> Components: test
> Reporter: Rohith Sharma K S
>
> Sometimes TestWorkPreservingRMRestart#testCapacitySchedulerRecovery fails
> randomly.
> {noformat}
> java.lang.AssertionError: expected:<<memory:2048, vCores:2>> but was:<<memory:1024, vCores:1>>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:144)
> at org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.checkCSLeafQueue(TestWorkPreservingRMRestart.java:289)
> at org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.testCapacitySchedulerRecovery(TestWorkPreservingRMRestart.java:501)
> {noformat}