[
https://issues.apache.org/jira/browse/YARN-11323?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17612037#comment-17612037
]
ASF GitHub Bot commented on YARN-11323:
---------------------------------------
slfan1989 commented on code in PR #4954:
URL: https://github.com/apache/hadoop/pull/4954#discussion_r985190116
##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/federation/retry/FederationActionRetry.java:
##########
@@ -0,0 +1,46 @@
+/**
+ * Licensed to the Apache Software Foundation (ASF) under one or more
+ * contributor license agreements. See the NOTICE file distributed with this
+ * work for additional information regarding copyright ownership. The ASF
+ * licenses this file to you under the Apache License, Version 2.0 (the
+ * "License"); you may not use this file except in compliance with the License.
+ * You may obtain a copy of the License at
+ *
+ * http://www.apache.org/licenses/LICENSE-2.0
+ *
+ * Unless required by applicable law or agreed to in writing, software
+ * distributed under the License is distributed on an "AS IS" BASIS, WITHOUT
+ * WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the
+ * License for the specific language governing permissions and limitations under
+ * the License.
+ */
+
+package org.apache.hadoop.yarn.server.federation.retry;
+
+import org.slf4j.Logger;
+import org.slf4j.LoggerFactory;
+
+public abstract class FederationActionRetry<T> {
+
+ public static final Logger LOG =
Review Comment:
I will fix it.
##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-resourcemanager/src/test/java/org/apache/hadoop/yarn/server/resourcemanager/federation/TestFederationRMStateStoreService.java:
##########
@@ -207,4 +233,254 @@ public void testFederationStateStoreServiceInitialHeartbeatDelay() throws Except
"Started federation membership heartbeat with interval: 300 and initial delay: 10"));
rm.stop();
}
+
+ @Test
+ public void testCleanUpApplication() throws Exception {
+
+ // set yarn configuration
+ conf.setBoolean(YarnConfiguration.FEDERATION_ENABLED, true);
+ conf.setInt(YarnConfiguration.FEDERATION_STATESTORE_HEARTBEAT_INITIAL_DELAY, 10);
+ conf.set(YarnConfiguration.RM_CLUSTER_ID, subClusterId.getId());
+
+ // set up MockRM
+ final MockRM rm = new MockRM(conf);
+ rm.init(conf);
+ stateStore = rm.getFederationStateStoreService().getStateStoreClient();
+ rm.start();
+
+ // init subCluster Heartbeat,
+ // and check that the subCluster is in a running state
+ FederationStateStoreService stateStoreService =
+ rm.getFederationStateStoreService();
+ FederationStateStoreHeartbeat storeHeartbeat =
+ stateStoreService.getStateStoreHeartbeatThread();
+ storeHeartbeat.run();
+ checkSubClusterInfo(SubClusterState.SC_RUNNING);
+
+ // generate an application and join the [SC-1] cluster
+ ApplicationId appId = ApplicationId.newInstance(Time.now(), 1);
+ addApplication2StateStore(appId, stateStore);
+
+ // make sure the app can be queried in the stateStore
+ GetApplicationHomeSubClusterRequest appRequest =
+ GetApplicationHomeSubClusterRequest.newInstance(appId);
+ GetApplicationHomeSubClusterResponse response =
+ stateStore.getApplicationHomeSubCluster(appRequest);
+ Assert.assertNotNull(response);
+ ApplicationHomeSubCluster appHomeSubCluster = response.getApplicationHomeSubCluster();
+ Assert.assertNotNull(appHomeSubCluster);
+ Assert.assertNotNull(appHomeSubCluster.getApplicationId());
+ Assert.assertEquals(appId, appHomeSubCluster.getApplicationId());
+
+ // clean up the app.
+ boolean cleanUpResult =
+ stateStoreService.cleanUpFinishApplicationsWithRetries(appId, true);
+ Assert.assertTrue(cleanUpResult);
+
+ // after clean, the app can no longer be queried from the stateStore.
+ LambdaTestUtils.intercept(FederationStateStoreException.class,
+ "Application " + appId + " does not exist",
+ () -> stateStore.getApplicationHomeSubCluster(appRequest));
+
+ }
+
+ @Test
+ public void testCleanUpApplicationWhenRMStart() throws Exception {
+
+ // We design such a test case.
+ // Step1. We add app01, app02, app03 to the stateStore,
+ // But these apps are not in RM's RMContext, they are finished apps
+ // Step2. We simulate RM startup, there is only app04 in RMContext.
+ // Step3. We wait for 5 seconds, the automatic cleanup thread should clean up finished apps.
+
+ // set yarn configuration.
+ conf.setBoolean(YarnConfiguration.FEDERATION_ENABLED, true);
+ conf.setInt(YarnConfiguration.FEDERATION_STATESTORE_HEARTBEAT_INITIAL_DELAY, 10);
+ conf.set(YarnConfiguration.RM_CLUSTER_ID, subClusterId.getId());
+ conf.setBoolean(YarnConfiguration.RECOVERY_ENABLED, true);
+
+ // set up MockRM.
+ MockRM rm = new MockRM(conf);
+ rm.init(conf);
+ stateStore = rm.getFederationStateStoreService().getStateStoreClient();
+
+ // generate an [app01] and join the [SC-1] cluster.
+ List<ApplicationId> appIds = new ArrayList();
Review Comment:
I will fix it.
> [Federation] Improve Router Handler FinishApps
> ----------------------------------------------
>
> Key: YARN-11323
> URL: https://issues.apache.org/jira/browse/YARN-11323
> Project: Hadoop YARN
> Issue Type: Improvement
> Components: federation, router, yarn
> Affects Versions: 3.4.0
> Reporter: fanshilun
> Assignee: fanshilun
> Priority: Major
> Labels: pull-request-available
>