[ 
https://issues.apache.org/jira/browse/YARN-4385?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15065598#comment-15065598
 ] 

Naganarasimha G R commented on YARN-4385:
-----------------------------------------

Hit one more intermittent failure on the 2928 branch, though it is not related 
to the ATS v2 code:
{code}
------------------------------------------------------
 T E S T S
-------------------------------------------------------
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
Running org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
Tests run: 13, Failures: 1, Errors: 0, Skipped: 0, Time elapsed: 476.165 sec <<< FAILURE! - in org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
testDSShellWithDomain(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell)  Time elapsed: 29.211 sec  <<< FAILURE!
java.lang.AssertionError: expected:<2> but was:<3>
        at org.junit.Assert.fail(Assert.java:88)
        at org.junit.Assert.failNotEquals(Assert.java:743)
        at org.junit.Assert.assertEquals(Assert.java:118)
        at org.junit.Assert.assertEquals(Assert.java:555)
        at org.junit.Assert.assertEquals(Assert.java:542)
        at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.checkTimelineV1(TestDistributedShell.java:356)
        at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:317)
        at org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShellWithDomain(TestDistributedShell.java:195)

Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
Running org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShellWithNodeLabels
Tests run: 1, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 39.703 sec - in org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShellWithNodeLabels
Java HotSpot(TM) 64-Bit Server VM warning: ignoring option MaxPermSize=768m; support was removed in 8.0
Running org.apache.hadoop.yarn.applications.distributedshell.TestDSAppMaster
Tests run: 2, Failures: 0, Errors: 0, Skipped: 0, Time elapsed: 2.508 sec - in org.apache.hadoop.yarn.applications.distributedshell.TestDSAppMaster

Results :

Failed tests: 
  TestDistributedShell.testDSShellWithDomain:195->testDSShell:317->checkTimelineV1:356 expected:<2> but was:<3>

Tests run: 16, Failures: 1, Errors: 0, Skipped: 0
{code}
{{TestDistributedShell.checkTimelineV1}} asserts that exactly 2 (the requested 
number of) containers are launched, but more than 2 are actually being launched 
(3 in this run). Possible reasons:
* The RM assigned additional containers and the Distributed Shell AM launched 
them. I have observed similar over-assignment with MR as well, but the MR AM 
takes care of returning the extra containers assigned by the RM. A similar 
approach should exist in the Distributed Shell AM too.
* A container was killed for some reason and an extra container was started in 
its place.

I am not sure which of these cases is causing the additional containers; 
analyzing this requires more RM and AM logs.
Possible solutions:
* Instead of checking for exactly 2 containers, check for at least 2, so that 
the test case does not fail if more than 2 containers are launched.
* Ensure that no more than the desired number of containers is launched, even 
if the RM allocates more.
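
To make the two options above concrete, here is a rough, dependency-free sketch. It is not the actual test or AM code: {{ContainerCountSketch}}, {{launchedEnough}}, {{splitAllocation}} and {{Split}} are hypothetical names, and containers are modelled as plain strings. In the real AM the surplus would presumably be handed back to the RM through the AMRMClient API rather than just collected in a list.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Illustration only. All names are hypothetical and are not taken from
 * TestDistributedShell or the distributed shell ApplicationMaster.
 */
public class ContainerCountSketch {

    /** Option 1: the test accepts "at least" the requested number of containers. */
    static boolean launchedEnough(int requested, int launched) {
        return launched >= requested;
    }

    /** Result of splitting one allocation batch received from the RM. */
    static final class Split<T> {
        final List<T> toLaunch = new ArrayList<>();
        final List<T> toRelease = new ArrayList<>();
    }

    /**
     * Option 2: launch only as many containers as are still needed and
     * set the rest aside so they can be returned instead of launched.
     */
    static <T> Split<T> splitAllocation(int requested, int alreadyLaunched,
                                        List<T> allocated) {
        int stillNeeded = Math.max(0, requested - alreadyLaunched);
        Split<T> split = new Split<>();
        for (T container : allocated) {
            if (split.toLaunch.size() < stillNeeded) {
                split.toLaunch.add(container);
            } else {
                split.toRelease.add(container);
            }
        }
        return split;
    }

    public static void main(String[] args) {
        // The RM hands out 3 containers although only 2 were requested.
        Split<String> split =
            splitAllocation(2, 0, Arrays.asList("c1", "c2", "c3"));
        System.out.println("launch=" + split.toLaunch
            + " release=" + split.toRelease);
        System.out.println("relaxed check passes: " + launchedEnough(2, 3));
    }
}
```

With the over-allocation seen in the failure above (2 requested, 3 allocated), the second option would launch c1 and c2 and set c3 aside for release, while the first option would simply let the test pass with 3 launched containers. The first option is the smaller change and also tolerates the killed-and-restarted case; the second addresses the over-launch itself.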
 

> TestDistributedShell times out
> ------------------------------
>
>                 Key: YARN-4385
>                 URL: https://issues.apache.org/jira/browse/YARN-4385
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: test
>            Reporter: Tsuyoshi Ozawa
>            Assignee: Naganarasimha G R
>         Attachments: 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell-output.txt
>
>




--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
