[ 
https://issues.apache.org/jira/browse/YARN-1533?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13856515#comment-13856515
 ] 

Liyin Liang commented on YARN-1533:
-----------------------------------

During TestDistributedShell.testDSShell(), the ApplicationMaster asks the RM 
for only two containers, using the following code:
{code}
    // Setup ask for containers from RM
    // Send request for containers to RM
    // Until we get our fully allocated quota, we keep on polling RM for
    // containers
    // Keep looping until all the containers are launched and shell script
    // executed on them ( regardless of success/failure).
    for (int i = 0; i < numTotalContainers; ++i) {
      ContainerRequest containerAsk = setupContainerAskForRM();
      amRMClient.addContainerRequest(containerAsk);
    }
{code}
But sometimes the app is allocated three containers (one in the first callback 
and two in the second). Here is the callback handler log:
{code}
2013-12-20 16:44:21,327 INFO  [AMRM Callback Handler Thread] distributedshell.ApplicationMaster (ApplicationMaster.java:onContainersAllocated(638)) - Got response from RM for container ask, allocatedCnt=1
2013-12-20 16:44:22,342 INFO  [AMRM Callback Handler Thread] distributedshell.ApplicationMaster (ApplicationMaster.java:onContainersCompleted(582)) - Got response from RM for container ask, completedCnt=1
2013-12-20 16:44:22,343 INFO  [AMRM Callback Handler Thread] distributedshell.ApplicationMaster (ApplicationMaster.java:onContainersAllocated(638)) - Got response from RM for container ask, allocatedCnt=2
2013-12-20 16:44:23,345 INFO  [AMRM Callback Handler Thread] distributedshell.ApplicationMaster (ApplicationMaster.java:onContainersCompleted(582)) - Got response from RM for container ask, completedCnt=2
{code}
In this case, the DistributedShell app needs more time to finish, and the test 
may time out.
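
To make the mismatch concrete, here is a minimal, self-contained sketch of how an allocation callback could cap the number of containers it accepts at the requested total and set the rest aside. It is only an illustration under assumptions: SurplusContainerFilter and its methods are hypothetical names, containers are modelled as plain strings, and this is not the actual DistributedShell code or a proposed patch.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Hypothetical helper (not part of DistributedShell): accepts allocated
 * containers only up to the number that was requested.
 */
class SurplusContainerFilter {
    private final int numTotalContainers; // how many containers were asked for
    private int numAccepted = 0;          // how many have been accepted so far

    SurplusContainerFilter(int numTotalContainers) {
        this.numTotalContainers = numTotalContainers;
    }

    /**
     * Called once per allocation callback. Returns the containers that
     * exceed the requested total, so the caller can hand them back
     * instead of launching a shell command on them.
     */
    synchronized List<String> onContainersAllocated(List<String> allocated) {
        List<String> surplus = new ArrayList<>();
        for (String containerId : allocated) {
            if (numAccepted < numTotalContainers) {
                numAccepted++;            // within quota: would be launched
            } else {
                surplus.add(containerId); // over quota: would be released
            }
        }
        return surplus;
    }

    synchronized int getNumAccepted() {
        return numAccepted;
    }

    public static void main(String[] args) {
        // Replays the log above: two containers requested, then
        // callbacks with allocatedCnt=1 and allocatedCnt=2.
        SurplusContainerFilter filter = new SurplusContainerFilter(2);
        System.out.println(filter.onContainersAllocated(
            Arrays.asList("container_1")));                // prints []
        System.out.println(filter.onContainersAllocated(
            Arrays.asList("container_2", "container_3"))); // prints [container_3]
    }
}
```

In a real ApplicationMaster the surplus would presumably be returned to the RM rather than just collected in a list; that wiring is left out here.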

> TestDistributedShell.testDSShell occasionally fails
> ---------------------------------------------------
>
>                 Key: YARN-1533
>                 URL: https://issues.apache.org/jira/browse/YARN-1533
>             Project: Hadoop YARN
>          Issue Type: Bug
>    Affects Versions: 2.2.0
>            Reporter: Liyin Liang
>
> TestDistributedShell.testDSShell is occasionally failing with the error:
> {code}
> -------------------------------------------------------------------------------
> Test set: 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
> -------------------------------------------------------------------------------
> Tests run: 1, Failures: 0, Errors: 1, Skipped: 0, Time elapsed: 114.163 sec 
> <<< FAILURE! - in 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell
> testDSShell(org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell)
>   Time elapsed: 90.009 sec  <<< ERROR!
> java.lang.Exception: test timed out after 90000 milliseconds
>         at java.lang.Object.wait(Native Method)
>         at java.lang.Thread.join(Thread.java:1186)
>         at java.lang.Thread.join(Thread.java:1239)
>         at 
> org.apache.hadoop.yarn.applications.distributedshell.TestDistributedShell.testDSShell(TestDistributedShell.java:163)
> {code}



--
This message was sent by Atlassian JIRA
(v6.1.5#6160)
