[ https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200203#comment-15200203 ]
Jason Lowe commented on YARN-4686:
----------------------------------

bq. Still interested in if Jason Lowe or Karthik Kambatla have comments, especially about removal of the (extra) threads in startResourceManager and serviceStart methods.

The thread removal is key, IMHO. MiniYARNCluster was a source of flaky tests because those threads allowed the mini cluster to return from its start method before its subcomponents completed their start methods. That means tests that assumed the cluster was started after cluster.start() were making a bad assumption. Removing these threads means the cluster really is started after the start method, assuming the RM and NM start methods correctly return only after they have started.

+1 patch looks good to me. I'm OK either way on the blind or checked transition to active since it's a fast no-op in the non-HA case. It will generate an extra "Already in active state" info message in the test logs but is otherwise benign.

> MiniYARNCluster.start() returns before cluster is completely started
> --------------------------------------------------------------------
>
>                 Key: YARN-4686
>                 URL: https://issues.apache.org/jira/browse/YARN-4686
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: test
>            Reporter: Rohith Sharma K S
>            Assignee: Eric Badger
>         Attachments: MAPREDUCE-6507.001.patch, YARN-4686.001.patch,
>                      YARN-4686.002.patch, YARN-4686.003.patch, YARN-4686.004.patch,
>                      YARN-4686.005.patch, YARN-4686.006.patch
>
>
> TestRMNMInfo fails intermittently. Below is trace for the failure
> {noformat}
> testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo)  Time elapsed: 0.28 sec  <<< FAILURE!
> java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but was:<3>
> 	at org.junit.Assert.fail(Assert.java:88)
> 	at org.junit.Assert.failNotEquals(Assert.java:743)
> 	at org.junit.Assert.assertEquals(Assert.java:118)
> 	at org.junit.Assert.assertEquals(Assert.java:555)
> 	at org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111)
> {noformat}
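
To illustrate the start-ordering problem described in Jason Lowe's comment above, here is a minimal sketch in plain Java. It does not use MiniYARNCluster or any Hadoop API: `Component`, `startInBackground`, and `startInline` are hypothetical names standing in for a subcomponent such as the RM or NM and for the threaded and non-threaded shapes of the start path. The latch is only there to make the slow initialization deterministic for the demo.

```java
import java.util.concurrent.CountDownLatch;

public class StartOrderingSketch {

    /** Hypothetical stand-in for an RM or NM; not a Hadoop class. */
    static class Component {
        private final CountDownLatch initGate;
        private volatile boolean started = false;

        Component(CountDownLatch initGate) {
            this.initGate = initGate;
        }

        /** Blocks until initialization is allowed to finish, then marks the component started. */
        void start() throws InterruptedException {
            initGate.await();
            started = true;
        }

        boolean isStarted() {
            return started;
        }
    }

    /** Threaded shape: the caller gets control back before the component has started. */
    static Thread startInBackground(Component c) {
        Thread t = new Thread(() -> {
            try {
                c.start();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        return t;
    }

    /** Non-threaded shape: returns only after the component's own start() has returned. */
    static void startInline(Component c) throws InterruptedException {
        c.start();
    }

    public static void main(String[] args) throws InterruptedException {
        // Initialization is held open, so the background start cannot have finished yet.
        CountDownLatch slowInit = new CountDownLatch(1);
        Component async = new Component(slowInit);
        Thread t = startInBackground(async);
        System.out.println("background start returned, started=" + async.isStarted()); // false

        slowInit.countDown(); // let initialization finish
        t.join();
        System.out.println("after join, started=" + async.isStarted()); // true

        // A latch with count 0 never blocks, so start() completes immediately.
        Component sync = new Component(new CountDownLatch(0));
        startInline(sync);
        System.out.println("inline start returned, started=" + sync.isStarted()); // true
    }
}
```

A test that checks state right after the threaded start sees an unstarted component, which is the same shape as the "expected:<4> but was:<3>" failure in the quoted trace. With the inline start, the check after the call is safe as long as the component's own start method returns only once it has really started.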