[ 
https://issues.apache.org/jira/browse/YARN-4686?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15200203#comment-15200203
 ] 

Jason Lowe commented on YARN-4686:
----------------------------------

bq. Still interested in if Jason Lowe or Karthik Kambatla have comments, 
especially about removal of the (extra) threads in startResourceManager and 
serviceStart methods.

The thread removal is key, IMHO.  MiniYARNCluster was a source of flaky tests 
because those threads allowed the mini cluster to return from its start method 
before its subcomponents completed their start methods.  That means tests that 
assumed the cluster was started after cluster.start() were making a bad 
assumption.  Removing these threads means the cluster really is started after 
the start method, assuming the RM and NM start methods correctly return only 
after they have started.
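
To make the race concrete, here is a minimal sketch in plain Java. It is not the MiniYARNCluster code: Component, startInBackground and startInline are hypothetical names, and a latch stands in for slow RM/NM startup work. It only illustrates why a start method that hands work to a thread can return before the component is up, while an inline start cannot.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicBoolean;

public class StartOrderingSketch {

    /** Hypothetical stand-in for an RM or NM whose start() returns only once it is started. */
    static class Component {
        private final AtomicBoolean started = new AtomicBoolean(false);
        private final CountDownLatch gate;

        Component(CountDownLatch gate) {
            this.gate = gate;
        }

        void start() throws InterruptedException {
            gate.await();        // simulates slow startup work
            started.set(true);
        }

        boolean isStarted() {
            return started.get();
        }
    }

    /** Threaded style: hands start() to another thread and returns immediately. */
    static Thread startInBackground(Component c) {
        Thread t = new Thread(() -> {
            try {
                c.start();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        t.start();
        return t;
    }

    /** Inline style: runs start() on the caller's thread, so it returns only when start() does. */
    static void startInline(Component c) throws InterruptedException {
        c.start();
    }

    public static void main(String[] args) throws Exception {
        // The gate stays closed, so the component cannot finish starting yet.
        CountDownLatch gate = new CountDownLatch(1);
        Component threaded = new Component(gate);
        Thread t = startInBackground(threaded);
        System.out.println("threaded start returned, started=" + threaded.isStarted());

        gate.countDown();        // let the startup work finish
        t.join();
        System.out.println("after join, started=" + threaded.isStarted());

        // A latch with count 0 never blocks, so start() completes right away.
        Component inline = new Component(new CountDownLatch(0));
        startInline(inline);
        System.out.println("inline start returned, started=" + inline.isStarted());
    }
}
```

In the threaded case the caller has to join or poll before it can trust isStarted(), which is the step the flaky tests were missing. In the inline case no extra waiting is needed, provided the component's own start method really does return only after it has started.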

+1 patch looks good to me.  I'm OK either way on the blind or checked 
transition to active since it's a fast no-op in the non-HA case.  It will 
generate an extra "Already in active state" info message in the test logs but 
is otherwise benign.


> MiniYARNCluster.start() returns before cluster is completely started
> --------------------------------------------------------------------
>
>                 Key: YARN-4686
>                 URL: https://issues.apache.org/jira/browse/YARN-4686
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: test
>            Reporter: Rohith Sharma K S
>            Assignee: Eric Badger
>         Attachments: MAPREDUCE-6507.001.patch, YARN-4686.001.patch, 
> YARN-4686.002.patch, YARN-4686.003.patch, YARN-4686.004.patch, 
> YARN-4686.005.patch, YARN-4686.006.patch
>
>
> TestRMNMInfo fails intermittently. Below is trace for the failure
> {noformat}
> testRMNMInfo(org.apache.hadoop.mapreduce.v2.TestRMNMInfo)  Time elapsed: 0.28 sec  <<< FAILURE!
> java.lang.AssertionError: Unexpected number of live nodes: expected:<4> but was:<3>
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.junit.Assert.failNotEquals(Assert.java:743)
>       at org.junit.Assert.assertEquals(Assert.java:118)
>       at org.junit.Assert.assertEquals(Assert.java:555)
>       at org.apache.hadoop.mapreduce.v2.TestRMNMInfo.testRMNMInfo(TestRMNMInfo.java:111)
> {noformat}



