[ 
https://issues.apache.org/jira/browse/YARN-11836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18009526#comment-18009526
 ] 

ASF GitHub Bot commented on YARN-11836:
---------------------------------------

p-szucs commented on code in PR #7813:
URL: https://github.com/apache/hadoop/pull/7813#discussion_r2228193192


##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/util/WebAppUtils.java:
##########
@@ -107,17 +107,29 @@ public static void setNMWebAppHostNameAndPort(Configuration conf,
    */
   public static <T, R> R execOnActiveRM(Configuration conf,
       ThrowingBiFunction<String, T, R> func, T arg) throws Exception {
-    int haIndex = 0;
+    int activeRMIndex = 0;
+    int rmCount = 1;
+
     if (HAUtil.isHAEnabled(conf)) {
+      ArrayList<String> rmIds = (ArrayList<String>) HAUtil.getRMHAIds(conf);
+      rmCount = rmIds.size();
       String activeRMId = RMHAUtils.findActiveRMHAId(conf);
       if (activeRMId != null) {
-        haIndex = new ArrayList<>(HAUtil.getRMHAIds(conf)).indexOf(activeRMId);
-      } else {
-        throw new ConnectException("No Active RM available");
+        activeRMIndex = rmIds.indexOf(activeRMId);
+      }
+    }
+
+    // In HA mode activeRMId can be fetched only if user have permission to check service states.
+    // Otherwise, we find the active one by iterating through the RMs
+    for (int i = activeRMIndex; i < rmCount; i++) {
+      try {
+        String rmAddress = getRMWebAppURLWithScheme(conf, i);
+        return func.apply(rmAddress, arg);
+      } catch (Exception e) {
+        // Ignore and try next RM if there are any

Review Comment:
   Fixed





> YARN CLI fails to fetch logs with "-am" option if user is not in Admin ACLs
> ---------------------------------------------------------------------------
>
>                 Key: YARN-11836
>                 URL: https://issues.apache.org/jira/browse/YARN-11836
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn-common
>    Affects Versions: 3.4.0, 3.4.1
>            Reporter: Peter Szucs
>            Assignee: Peter Szucs
>            Priority: Major
>              Labels: pull-request-available
>
> YARN-10767 introduced a bug: the YARN Logs CLI cannot fetch the AM logs 
> with the "-am" option if the user is not in the Admin ACLs.
> That commit changed the logic for requesting the AM logs: it now fetches the 
> "id" of the active RM from the HA service and requests the logs from that RM.
>  
> *Reproduction:*
> The issue can be reproduced by running the "{_}yarn logs -applicationId ‹appId› 
> -am 1{_}" command as a user who does not have admin access.
> The RM logs of the test cluster then show the following error, which 
> states that the user doesn't have permission to call '{_}getServiceState{_}':
> {code:java}
> IPC Server handler 0 on default port 8033, call Call#3 Retry#0 org.apache.hadoop.ha.HAServiceProtocol.getServiceStatus
> org.apache.hadoop.security.AccessControlException: User systest doesn't have permission to call 'getServiceState'
> at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.verifyAdminAccess(RMServerUtils.java:433)
> at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.verifyAdminAccess(RMServerUtils.java:398)
> at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkAccess(AdminService.java:243)
> at org.apache.hadoop.yarn.server.resourcemanager.AdminService.getServiceStatus(AdminService.java:396)
> at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.getServiceStatus(HAServiceProtocolServerSideTranslatorPB.java:148)
> at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:6154)
> at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621)
> at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)
> at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
> at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1247)
> at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1170)
> at java.security.AccessController.doPrivileged(Native Method)
> at javax.security.auth.Subject.doAs(Subject.java:422)
> at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1964)
> at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3200){code}
>  
> *Full call stack for reference:*
> LogsCli.getAMContainerInfoForRMWebService -›
> WebAppUtils.execOnActiveRM -›
> RMHAUtils.findActiveRMHAId(conf) -›
> RMHAUtils.getHAState -›
> proto.getServiceStatus().getState() -›
> AdminService.getServiceStatus -›
> AdminService.checkAccess
>  
> Currently, the _execOnActiveRM_ method of {_}WebAppUtils{_} throws an 
> exception when _RMHAUtils.findActiveRMHAId_ returns null 
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/util/WebAppUtils.java#L116],
>  stating that "No Active RM available". However, that method also returns 
> null when the user lacks permission to check the service states. In that 
> case we could fall back to the original code and try to find the active RM 
> by iterating through the RMs. 
> The issue only happens in HA mode, and only with the "{_}-am{_}" option. 
> Without this option, the AM logs can be retrieved together with the 
> aggregated logs.
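
The fallback described above (try the RM believed to be active first, then move on to the next RM whenever a call fails) can be sketched roughly as follows. This is a minimal, self-contained illustration and not the code from the PR: the class name ActiveRmFallbackSketch, the ThrowingFunction interface, the execOnFirstResponsive method and the RM addresses are all hypothetical, and it has no Hadoop dependencies. The quoted diff does not show what happens once every RM has failed, so rethrowing the last failure here is an assumption.

```java
import java.io.IOException;
import java.net.ConnectException;
import java.util.List;

public class ActiveRmFallbackSketch {

  /** Hypothetical stand-in for a call that may throw, e.g. a request to one RM. */
  @FunctionalInterface
  public interface ThrowingFunction<A, R> {
    R apply(A arg) throws Exception;
  }

  /**
   * Tries the RM at startIndex first, then each later RM in order.
   * Returns the first successful result. If every attempt fails, the last
   * failure is rethrown (an assumption; the quoted diff is cut off here).
   */
  public static <R> R execOnFirstResponsive(List<String> rmAddresses,
      int startIndex, ThrowingFunction<String, R> func) throws Exception {
    Exception lastFailure = null;
    for (int i = startIndex; i < rmAddresses.size(); i++) {
      try {
        return func.apply(rmAddresses.get(i));
      } catch (Exception e) {
        // Remember the failure and try the next RM, if there is one.
        lastFailure = e;
      }
    }
    if (lastFailure != null) {
      throw lastFailure;
    }
    // Only reached when there was no RM to try at all.
    throw new ConnectException("No Active RM available");
  }

  public static void main(String[] args) throws Exception {
    // Made-up addresses: rm1 plays a standby RM that rejects the request.
    List<String> rms = List.of("http://rm1:8088", "http://rm2:8088");
    String result = execOnFirstResponsive(rms, 0, addr -> {
      if (addr.contains("rm1")) {
        throw new IOException("standby RM: " + addr);
      }
      return "logs from " + addr;
    });
    System.out.println(result);
  }
}
```

When the active RM id cannot be determined, the start index stays at 0 and the loop probes each RM in turn. When it is known, the loop starts at that index, so the first attempt normally succeeds.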



--
This message was sent by Atlassian Jira
(v8.20.10#820010)