[ 
https://issues.apache.org/jira/browse/YARN-11566?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17765123#comment-17765123
 ] 

ASF GitHub Bot commented on YARN-11566:
---------------------------------------

slfan1989 commented on code in PR #6068:
URL: https://github.com/apache/hadoop/pull/6068#discussion_r1325821608


##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-server/hadoop-yarn-server-common/src/main/java/org/apache/hadoop/yarn/server/uam/UnmanagedAMPoolManager.java:
##########
@@ -485,50 +488,33 @@ public Map<String, FinishApplicationMasterResponse> 
batchFinishApplicationMaster
     return responseMap;
   }
 
-  Runnable createForceFinishApplicationThread() {
+  Runnable createForceFinishApplicationThread(
+      Map<String, UnmanagedApplicationManager> unmanagedAppToFinish) {
     return () -> {
-
-      ExecutorCompletionService<Pair<String, KillApplicationResponse>> 
completionService =
-          new ExecutorCompletionService<>(threadpool);
-
-      // Save a local copy of the key set so that it won't change with the map
-      Set<String> addressList = new HashSet<>(unmanagedAppMasterMap.keySet());
-
-      LOG.warn("Abnormal shutdown of UAMPoolManager, still {} UAMs in map", 
addressList.size());
-
-      for (final String uamId : addressList) {
-        completionService.submit(() -> {
-          try {
-            ApplicationId appId = appIdMap.get(uamId);
-            LOG.info("Force-killing UAM id {} for application {}", uamId, 
appId);
-            UnmanagedApplicationManager applicationManager = 
unmanagedAppMasterMap.remove(uamId);
-            KillApplicationResponse response = 
applicationManager.forceKillApplication();
-            return Pair.of(uamId, response);
-          } catch (Exception e) {
-            LOG.error("Failed to kill unmanaged application master", e);
-            return Pair.of(uamId, null);
-          }
-        });
-      }
-
-      for (int i = 0; i < addressList.size(); ++i) {
+      LOG.warn("Abnormal shutdown of UAMPoolManager, still {} UAMs in map",
+          unmanagedAppToFinish.size());
+
+      // We should not use threadpool to execute the operation 
'forceKillApplication'.
+      // Because ForceFinishApplication run in asynchronous thread, threadpool 
may be destroyed.
+      // Since we kill app in asynchronous thread, we could kill app 
sequentially,
+      // no operations will get stuck.
+      for (Map.Entry<String, UnmanagedApplicationManager> entry : 
unmanagedAppToFinish.entrySet()) {
+        String uamId = entry.getKey();
+        UnmanagedApplicationManager applicationManager = entry.getValue();
         try {
-          Future<Pair<String, KillApplicationResponse>> future = 
completionService.take();
-          Pair<String, KillApplicationResponse> pairs = future.get();
-          String uamId = pairs.getLeft();
           ApplicationId appId = appIdMap.get(uamId);
-          KillApplicationResponse response = pairs.getRight();
+          LOG.info("Force-killing UAM id {} for application {}", uamId, appId);
+          KillApplicationResponse response = 
applicationManager.forceKillApplication();
           if (response == null) {
-            throw new YarnException(
-                "Failed Force-killing UAM id " + uamId + " for application " + 
appId);
+            LOG.error("Failed Force-killing UAM id " + uamId + " for 
application " + appId);

Review Comment:
   LOG.error("Failed Force-killing UAM id {} for application {}.", appId);





> Yarn app kill command can not kill the application in secondary sub cluster.
> ----------------------------------------------------------------------------
>
>                 Key: YARN-11566
>                 URL: https://issues.apache.org/jira/browse/YARN-11566
>             Project: Hadoop YARN
>          Issue Type: Improvement
>            Reporter: zhengchenyu
>            Assignee: zhengchenyu
>            Priority: Major
>              Labels: pull-request-available
>
> When AMRMProxy is enable, the application may allocate container among multi 
> sub cluster. The application in secondary sub cluster will be labeled as 
> unmananged application. When we run 'yarn app -kill \{appid}', the unmananged 
> application will not be killed in secondary sub cluster.
> The unmanaged application will be removed util app attempt is expired after 
> 15 minute.



--
This message was sent by Atlassian Jira
(v8.20.10#820010)

---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]

Reply via email to