[ 
https://issues.apache.org/jira/browse/YARN-11509?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17733292#comment-17733292
 ] 

ASF GitHub Bot commented on YARN-11509:
---------------------------------------

slfan1989 commented on code in PR #5727:
URL: https://github.com/apache/hadoop/pull/5727#discussion_r1231725983


##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-nodemanager/src/test/java/org/apache/hadoop/yarn/server/nodemanager/amrmproxy/TestFederationInterceptor.java:
##########
@@ -1432,4 +1432,53 @@ private void finishApplication() throws IOException, YarnException {
     Assert.assertNotNull(finishResponse);
     Assert.assertTrue(finishResponse.getIsUnregistered());
   }
+
+  @Test
+  public void testLaunchUAMAndRegisterApplicationMasterRetry() throws Exception {
+
+    UserGroupInformation ugi = interceptor.getUGIWithToken(interceptor.getAttemptId());
+    interceptor.setRetryCount(2);
+
+    ugi.doAs((PrivilegedExceptionAction<Object>) () -> {
+      // Register the application
+      RegisterApplicationMasterRequest registerReq =
+          Records.newRecord(RegisterApplicationMasterRequest.class);
+      registerReq.setHost(Integer.toString(testAppId));
+      registerReq.setRpcPort(0);
+      registerReq.setTrackingUrl("");
+
+      RegisterApplicationMasterResponse registerResponse =
+          interceptor.registerApplicationMaster(registerReq);
+      Assert.assertNotNull(registerResponse);
+      lastResponseId = 0;
+
+      Assert.assertEquals(0, interceptor.getUnmanagedAMPoolSize());
+
+      // Allocate the first batch of containers, with sc1 active
+      registerSubCluster(SubClusterId.newInstance("SC-1"));
+
+      int numberOfContainers = 3;
+      List<Container> containers = getContainersAndAssert(numberOfContainers, numberOfContainers);
+      Assert.assertEquals(1, interceptor.getUnmanagedAMPoolSize());
+
+      // Release all containers
+      releaseContainersAndAssert(containers);
+
+      // Finish the application
+      FinishApplicationMasterRequest finishReq =
+          Records.newRecord(FinishApplicationMasterRequest.class);
+      finishReq.setDiagnostics("");
+      finishReq.setTrackingUrl("");
+      finishReq.setFinalApplicationStatus(FinalApplicationStatus.SUCCEEDED);
+
+      FinishApplicationMasterResponse finishResponse =
+          interceptor.finishApplicationMaster(finishReq);
+      Assert.assertNotNull(finishResponse);
+      Assert.assertEquals(true, finishResponse.getIsUnregistered());

Review Comment:
   I will modify the code.





> The FederationInterceptor#launchUAM Added retry logic.
> ------------------------------------------------------
>
>                 Key: YARN-11509
>                 URL: https://issues.apache.org/jira/browse/YARN-11509
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: amrmproxy
>    Affects Versions: 3.4.0
>            Reporter: Shilun Fan
>            Assignee: Shilun Fan
>            Priority: Minor
>              Labels: pull-request-available
>
> There is a "todo" in the 
> FederationInterceptor#registerAndAllocateWithNewSubClusters method. According 
> to the "todo" description, the request should be retried against other 
> subclusters, but modifying the requests parameter inside 
> registerAndAllocateWithNewSubClusters is not a good approach. It is better 
> to add retry logic here instead. 
> We don't need to worry about losing requests: when a request cannot 
> be satisfied, the application's AM will keep asking, and those requests 
> will be routed to other subclusters for execution.
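
For readers unfamiliar with the pattern, the retry logic described above might look roughly like the sketch below. It is an illustration only, not the code from the PR: the class name RetrySketch, the helper runWithRetries, and the reading of the retry count as "additional attempts after the first" are all assumptions made for this example.

```java
import java.util.concurrent.Callable;

// Hypothetical illustration; not part of FederationInterceptor.
class RetrySketch {

  /**
   * Runs the action once, then retries it up to retryCount more times
   * if it throws. Rethrows the last failure when every attempt fails.
   * (Treating retryCount as "extra attempts" is an assumption.)
   */
  static <T> T runWithRetries(Callable<T> action, int retryCount) throws Exception {
    if (retryCount < 0) {
      throw new IllegalArgumentException("retryCount must be >= 0");
    }
    Exception last = null;
    for (int attempt = 0; attempt <= retryCount; attempt++) {
      try {
        return action.call();
      } catch (Exception e) {
        // Remember the failure and fall through to the next attempt.
        last = e;
      }
    }
    throw last;
  }
}
```

Under these assumptions, a launch-and-register step would be wrapped as runWithRetries(() -> launchAndRegister(), 2), where launchAndRegister is a placeholder for the real call. Requests that still fail after the last attempt are not re-queued here, which matches the reasoning above that the AM will ask again.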



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

