[
https://issues.apache.org/jira/browse/YARN-5317?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15373036#comment-15373036
]
sandflee commented on YARN-5317:
--------------------------------
sendAMLaunched is mainly used in two scenarios:
1. Submit an app, then send AM launched.
{code}
RMApp app1 = rm.submitApp(testAlloc);
nm1.nodeHeartbeat(true);
RMAppAttempt attempt1 = app1.getCurrentAppAttempt();
MockAM am1 = rm.sendAMLaunched(attempt1.getAppAttemptId());
{code}
This is OK, because after submitApp the app becomes ACCEPTED and the appAttempt
becomes SCHEDULED.
2. The AM container completes, then send AM launched. Here the test should
explicitly wait for the appAttempt to become SCHEDULED before sending the node
heartbeat (or use MockRM#launchAM). This seems to happen only in testAMRestart.
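The ordering in case 2 can be sketched without any Hadoop classes. This is only a minimal model of "wait for the state, then act", not the MockRM API: WaitForStateSketch, AttemptState, waitForState and demo are hypothetical names, and the background thread stands in for the asynchronous transition to SCHEDULED after the AM container completes.

```java
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Supplier;

final class WaitForStateSketch {
    // Hypothetical stand-in for the attempt states mentioned in this comment.
    enum AttemptState { NEW, SCHEDULED, LAUNCHED }

    /**
     * Polls the supplier until it reports the expected state or the timeout
     * expires. Returns true only if the expected state was observed.
     */
    static boolean waitForState(Supplier<AttemptState> current, AttemptState expected,
                                long timeoutMs, long pollMs) {
        long deadline = System.nanoTime() + timeoutMs * 1_000_000L;
        while (current.get() != expected) {
            if (System.nanoTime() - deadline >= 0) {
                return false; // timed out; caller should fail fast here
            }
            try {
                Thread.sleep(pollMs);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return false;
            }
        }
        return true;
    }

    /** Simulates an attempt that reaches SCHEDULED asynchronously. */
    static boolean demo() {
        AtomicReference<AttemptState> state = new AtomicReference<>(AttemptState.NEW);
        Thread transition = new Thread(() -> {
            try {
                Thread.sleep(50); // assumed delay of the async state change
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            state.set(AttemptState.SCHEDULED);
        });
        transition.start();

        // Wait explicitly instead of assuming the transition already happened.
        boolean scheduled = waitForState(state::get, AttemptState.SCHEDULED, 2000, 10);
        // Only at this point would the test send the node heartbeat.
        try {
            transition.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return scheduled;
    }

    public static void main(String[] args) {
        System.out.println("scheduled=" + demo());
    }
}
```

The point of the explicit wait is that the test no longer depends on how long the state transition takes. A fixed sleep only works when the transition happens to finish within that time.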
bq. Also I noticed the same code snippet appears in
testAMRestartWithExistingContainers, so I'm wondering if it could have the same
issue.
Yes, it has the same issue, but that test sleeps 3s after the AM container
completes, which is enough for the appAttempt to become SCHEDULED. Still, it's
reasonable to add this check.
> testAMRestartNotLostContainerCompleteMsg may fail
> -------------------------------------------------
>
> Key: YARN-5317
> URL: https://issues.apache.org/jira/browse/YARN-5317
> Project: Hadoop YARN
> Issue Type: Test
> Reporter: sandflee
> Assignee: sandflee
> Priority: Minor
> Attachments: YARN-5317.01.patch
>
>
> java.lang.Exception: test timed out after 30000 milliseconds
>     at java.lang.Thread.sleep(Native Method)
>     at org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:261)
>     at org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:225)
>     at org.apache.hadoop.yarn.server.resourcemanager.MockRM.waitForState(MockRM.java:207)
>     at org.apache.hadoop.yarn.server.resourcemanager.MockRM.sendAMLaunched(MockRM.java:746)
>     at org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager.TestAMRestart.testAMRestartNotLostContainerCompleteMsg(TestAMRestart.java:841)
> see
> https://builds.apache.org/job/PreCommit-YARN-Build/12204/testReport/org.apache.hadoop.yarn.server.resourcemanager.applicationsmanager/TestAMRestart/testAMRestartNotLostContainerCompleteMsg/