[ https://issues.apache.org/jira/browse/YARN-11836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18009526#comment-18009526 ]
ASF GitHub Bot commented on YARN-11836:
---------------------------------------

p-szucs commented on code in PR #7813:
URL: https://github.com/apache/hadoop/pull/7813#discussion_r2228193192

##########
hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/util/WebAppUtils.java:
##########
@@ -107,17 +107,29 @@ public static void setNMWebAppHostNameAndPort(Configuration conf,
    */
   public static <T, R> R execOnActiveRM(Configuration conf,
       ThrowingBiFunction<String, T, R> func, T arg) throws Exception {
-    int haIndex = 0;
+    int activeRMIndex = 0;
+    int rmCount = 1;
+
     if (HAUtil.isHAEnabled(conf)) {
+      ArrayList<String> rmIds = (ArrayList<String>) HAUtil.getRMHAIds(conf);
+      rmCount = rmIds.size();
       String activeRMId = RMHAUtils.findActiveRMHAId(conf);
       if (activeRMId != null) {
-        haIndex = new ArrayList<>(HAUtil.getRMHAIds(conf)).indexOf(activeRMId);
-      } else {
-        throw new ConnectException("No Active RM available");
+        activeRMIndex = rmIds.indexOf(activeRMId);
+      }
+    }
+
+    // In HA mode activeRMId can be fetched only if user have permission to check service states.
+    // Otherwise, we find the active one by iterating through the RMs
+    for (int i = activeRMIndex; i < rmCount; i++) {
+      try {
+        String rmAddress = getRMWebAppURLWithScheme(conf, i);
+        return func.apply(rmAddress, arg);
+      } catch (Exception e) {
+        // Ignore and try next RM if there are any

Review Comment:
   Fixed

> YARN CLI fails to fetch logs with "-am" option if user is not in Admin ACLs
> ---------------------------------------------------------------------------
>
>                 Key: YARN-11836
>                 URL: https://issues.apache.org/jira/browse/YARN-11836
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn-common
>    Affects Versions: 3.4.0, 3.4.1
>            Reporter: Peter Szucs
>            Assignee: Peter Szucs
>            Priority: Major
>              Labels: pull-request-available
>
> YARN-10767 introduced a bug where the YARN Logs CLI is unable to fetch the AM
> logs using the "-am" option if the user is not in the Admin ACLs.
> That commit changed the logic for requesting the AM logs: it now fetches the
> "id" of the active RM from the HA service and requests the logs from there.
>
> *Reproduction:*
> The issue can be reproduced by running the "{_}yarn logs -applicationId <appId>
> -am 1{_}" command as a user without admin access.
> In the RM logs of the test cluster I can see the following error, which
> states that the user doesn't have permission to call '{_}getServiceState{_}':
> {code:java}
> IPC Server handler 0 on default port 8033, call Call#3 Retry#0 org.apache.hadoop.ha.HAServiceProtocol.getServiceStatus
> org.apache.hadoop.security.AccessControlException: User systest doesn't have permission to call 'getServiceState'
>     at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.verifyAdminAccess(RMServerUtils.java:433)
>     at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.verifyAdminAccess(RMServerUtils.java:398)
>     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkAccess(AdminService.java:243)
>     at org.apache.hadoop.yarn.server.resourcemanager.AdminService.getServiceStatus(AdminService.java:396)
>     at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.getServiceStatus(HAServiceProtocolServerSideTranslatorPB.java:148)
>     at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:6154)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
>     at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1247)
>     at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1170)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1964)
>     at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3200){code}
>
> *Full call stack for reference:*
> LogsCli.getAMContainerInfoForRMWebService ->
> WebAppUtils.execOnActiveRM ->
> RMHAUtils.findActiveRMHAId(conf) ->
> RMHAUtils.getHAState ->
> proto.getServiceStatus().getState() ->
> AdminService.getServiceStatus ->
> AdminService.checkAccess
>
> Currently, the _execOnActiveRM_ method in {_}WebAppUtils{_} throws an
> exception when _RMHAUtils.findActiveRMHAId_ returns null
> [here|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/util/WebAppUtils.java#L116],
> stating that "No active RM is available". However, that method also returns
> null when the caller lacks permission to check the service states. I think at
> this point we could fall back to the original code and try to find the
> active RM by iterating through the RMs.
> The issue only happens in HA mode, and only with the "{_}-am{_}" option;
> without this option, the AM logs can be retrieved together with the
> aggregated logs.
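
The fallback proposed in the description (try each RM in turn when the active one cannot be identified) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the code from the PR: `RmFailoverSketch`, `ThrowingFunction`, `execOnFirstResponsive`, and the example addresses are hypothetical names, and the list of addresses stands in for whatever the Hadoop configuration would provide.

```java
import java.net.ConnectException;
import java.util.List;

/** Hypothetical stand-in for the fallback described in the issue; not Hadoop code. */
final class RmFailoverSketch {

  /** A function that may throw a checked exception (assumed helper type). */
  interface ThrowingFunction<A, R> {
    R apply(A arg) throws Exception;
  }

  /**
   * Tries each RM address in order, starting at startIndex, and returns the
   * first successful result. If every attempt fails, the last failure is
   * rethrown wrapped in an IllegalStateException.
   */
  static <R> R execOnFirstResponsive(List<String> rmAddresses, int startIndex,
      ThrowingFunction<String, R> func) {
    Exception lastFailure = null;
    for (int i = startIndex; i < rmAddresses.size(); i++) {
      try {
        return func.apply(rmAddresses.get(i));
      } catch (Exception e) {
        // Remember the failure and move on to the next RM, if any.
        lastFailure = e;
      }
    }
    throw new IllegalStateException("No RM answered the request", lastFailure);
  }

  public static void main(String[] args) {
    // Made-up addresses; rm1 plays the standby RM and rm2 the active one.
    List<String> rms =
        List.of("http://rm1.example.com:8088", "http://rm2.example.com:8088");
    String result = execOnFirstResponsive(rms, 0, address -> {
      if (address.contains("rm1")) {
        throw new ConnectException("standby RM refused the request");
      }
      return "logs from " + address;
    });
    System.out.println(result);
  }
}
```

Starting at index 0 corresponds to the case where the active RM ID could not be determined; when it is known, passing its index as `startIndex` means the active RM is tried first. Keeping the last exception as the cause, rather than discarding it, is a choice made for this sketch so that a total failure remains diagnosable.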