[ 
https://issues.apache.org/jira/browse/YARN-2666?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14389102#comment-14389102
 ] 

zhihai xu commented on YARN-2666:
---------------------------------

Thanks [~ywskycn] for assigning this JIRA to me. I uploaded a patch,
YARN-2666.000.patch, for review.
The patch makes sure testContinuousScheduling doesn't depend on timing by
assigning the second allocation request to the node with more available
resource. It doesn't matter which node the first allocation request is
assigned to. The continuous scheduling thread is stopped before the second
allocation request is made and started again after it. The scheduler sorts
the nodes by available resource before assigning containers to them. In this
case, the node that the first allocation request was assigned to has less
available resource, so the second allocation request will be assigned to the
node with more available resource. The two requests will therefore be
assigned to different nodes.
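To illustrate the reasoning, here is a minimal, self-contained sketch. It is
not the code from the patch and does not use any Hadoop class; the Node class,
the assign method and the memory sizes are hypothetical stand-ins. It only
shows that when nodes are sorted by available resource before each assignment,
the second request goes to the node that did not receive the first one.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

public class ContinuousSchedulingSketch {

    /** Hypothetical stand-in for a scheduler node; not a Hadoop class. */
    static final class Node {
        final String name;
        int availableMb;

        Node(String name, int availableMb) {
            this.name = name;
            this.availableMb = availableMb;
        }
    }

    /**
     * Sorts a copy of the node list by available memory, largest first,
     * and places the request on the first node that can hold it.
     * Returns the chosen node's name, or null if no node fits.
     */
    static String assign(List<Node> nodes, int requestMb) {
        List<Node> sorted = new ArrayList<>(nodes);
        sorted.sort(Comparator.comparingInt((Node n) -> n.availableMb).reversed());
        for (Node n : sorted) {
            if (n.availableMb >= requestMb) {
                n.availableMb -= requestMb;
                return n.name;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Two nodes of equal size (assumed values for illustration).
        List<Node> nodes = Arrays.asList(new Node("node1", 8192), new Node("node2", 8192));
        String first = assign(nodes, 1024);
        String second = assign(nodes, 1024);
        System.out.println("first=" + first + " second=" + second
            + " different=" + !first.equals(second));
    }
}
```

Whichever node takes the first request ends up with less available memory, so
the sort puts the other node first for the second request.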
The patch didn't touch any code except TestFairScheduler, so the test
failure (TestAMRMTokens) is not related to my patch.
I hit this issue twice yesterday. This intermittent failure has existed for a
long time, so it would be better to fix it.
Hi [~ozawa], could you review the patch? Many thanks.
 

> TestFairScheduler.testContinuousScheduling fails Intermittently
> ---------------------------------------------------------------
>
>                 Key: YARN-2666
>                 URL: https://issues.apache.org/jira/browse/YARN-2666
>             Project: Hadoop YARN
>          Issue Type: Test
>          Components: scheduler
>            Reporter: Tsuyoshi Ozawa
>            Assignee: zhihai xu
>         Attachments: YARN-2666.000.patch
>
>
> The test fails on trunk.
> {code}
> Tests run: 79, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 8.698 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler
> testContinuousScheduling(org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler)
>   Time elapsed: 0.582 sec  <<< FAILURE!
> java.lang.AssertionError: expected:<2> but was:<1>
>       at org.junit.Assert.fail(Assert.java:88)
>       at org.junit.Assert.failNotEquals(Assert.java:743)
>       at org.junit.Assert.assertEquals(Assert.java:118)
>       at org.junit.Assert.assertEquals(Assert.java:555)
>       at org.junit.Assert.assertEquals(Assert.java:542)
>       at 
> org.apache.hadoop.yarn.server.resourcemanager.scheduler.fair.TestFairScheduler.testContinuousScheduling(TestFairScheduler.java:3372)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
