[ https://issues.apache.org/jira/browse/YARN-3790?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14580255#comment-14580255 ]
Rohith commented on YARN-3790:
------------------------------
Thanks for looking into this issue,
bq. If UpdateThread call update after recoverContainersOnNode, the test will
succeed
In the test, I see the code below, which already waits for and verifies that the containers have been recovered, right?
{code}
// Wait for RM to settle down on recovering containers;
waitForNumContainersToRecover(2, rm2, am1.getApplicationAttemptId());
Set<ContainerId> launchedContainers =
    ((RMNodeImpl) rm2.getRMContext().getRMNodes().get(nm1.getNodeId()))
        .getLaunchedContainers();
assertTrue(launchedContainers.contains(amContainer.getContainerId()));
assertTrue(launchedContainers.contains(runningContainer.getContainerId()));
{code}
Am I missing anything?
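
To make the timing question concrete, here is a minimal sketch of the "poll until the value settles, then assert" pattern. It is an illustration only: `PollingWait`, `waitFor`, and `metricValue` are hypothetical names and not part of the YARN test code, and I am not assuming anything about how waitForNumContainersToRecover is actually implemented. The values 8192 and 6144 are taken from the failure trace quoted below.

```java
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.BooleanSupplier;

// Hypothetical helper for illustration; not part of the YARN test code.
final class PollingWait {
  private PollingWait() {}

  /**
   * Polls the condition every intervalMs until it holds or timeoutMs elapses.
   * Returns true if the condition was observed to hold, false on timeout
   * or if the calling thread is interrupted while sleeping.
   */
  static boolean waitFor(BooleanSupplier condition, long intervalMs, long timeoutMs) {
    long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
    while (true) {
      if (condition.getAsBoolean()) {
        return true;
      }
      if (System.nanoTime() - deadline >= 0) {
        return false;
      }
      try {
        Thread.sleep(intervalMs);
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt(); // preserve the interrupt flag
        return false;
      }
    }
  }

  public static void main(String[] args) throws Exception {
    // Stand-in for a value that a background thread refreshes asynchronously.
    AtomicInteger metricValue = new AtomicInteger(8192);
    Thread updater = new Thread(() -> {
      try {
        Thread.sleep(50); // simulate the update arriving late
      } catch (InterruptedException e) {
        Thread.currentThread().interrupt();
        return;
      }
      metricValue.set(6144);
    });
    updater.start();

    // Wait for the value to settle instead of asserting on it immediately.
    boolean settled = waitFor(() -> metricValue.get() == 6144, 10, 2000);
    updater.join();
    System.out.println("settled=" + settled + " metricValue=" + metricValue.get());
  }
}
```

If the metrics assertion reads a value that is only refreshed by a background thread, waiting on that value itself (rather than only on the recovered container count) would be one way to remove the race. Whether that applies here depends on what assertMetrics actually reads.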
> TestWorkPreservingRMRestart#testSchedulerRecovery fails intermittently in
> trunk for FS scheduler
> ------------------------------------------------------------------------------------------------
>
> Key: YARN-3790
> URL: https://issues.apache.org/jira/browse/YARN-3790
> Project: Hadoop YARN
> Issue Type: Bug
> Components: test
> Reporter: Rohith
> Assignee: zhihai xu
>
> Failure trace is as follows
> {noformat}
> Tests run: 28, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 284.078 sec <<< FAILURE! - in org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart
> testSchedulerRecovery[1](org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart) Time elapsed: 6.502 sec <<< FAILURE!
> java.lang.AssertionError: expected:<6144> but was:<8192>
> at org.junit.Assert.fail(Assert.java:88)
> at org.junit.Assert.failNotEquals(Assert.java:743)
> at org.junit.Assert.assertEquals(Assert.java:118)
> at org.junit.Assert.assertEquals(Assert.java:555)
> at org.junit.Assert.assertEquals(Assert.java:542)
> at org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.assertMetrics(TestWorkPreservingRMRestart.java:853)
> at org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.checkFSQueue(TestWorkPreservingRMRestart.java:342)
> at org.apache.hadoop.yarn.server.resourcemanager.TestWorkPreservingRMRestart.testSchedulerRecovery(TestWorkPreservingRMRestart.java:241)
> {noformat}