[ 
https://issues.apache.org/jira/browse/YARN-11836?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Peter Szucs updated YARN-11836:
-------------------------------
    Description: 
YARN-10767 introduced a bug: the YARN Logs CLI cannot fetch the AM logs with 
the "-am" option if the user is not in the Admin ACLs.

That commit changed the logic for requesting the AM logs: it now fetches the 
"id" of the active RM from the HA service and requests the logs from that RM.

 

*Reproduction:*

The issue can be reproduced by running the "{_}yarn logs -applicationId ‹appId› 
-am 1{_}" command as a user who does not have admin access.

In the RM logs of the test cluster I can see the following error, which states 
that the user doesn't have permission to call '{_}getServiceState{_}':
{code:java}
IPC Server handler 0 on default port 8033, call Call#3 Retry#0 org.apache.hadoop.ha.HAServiceProtocol.getServiceStatus
org.apache.hadoop.security.AccessControlException: User systest doesn't have permission to call 'getServiceState'
at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.verifyAdminAccess(RMServerUtils.java:433)
at org.apache.hadoop.yarn.server.resourcemanager.RMServerUtils.verifyAdminAccess(RMServerUtils.java:398)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.checkAccess(AdminService.java:243)
at org.apache.hadoop.yarn.server.resourcemanager.AdminService.getServiceStatus(AdminService.java:396)
at org.apache.hadoop.ha.protocolPB.HAServiceProtocolServerSideTranslatorPB.getServiceStatus(HAServiceProtocolServerSideTranslatorPB.java:148)
at org.apache.hadoop.ha.proto.HAServiceProtocolProtos$HAServiceProtocolService$2.callBlockingMethod(HAServiceProtocolProtos.java:6154)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:621)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:589)
at org.apache.hadoop.ipc.ProtobufRpcEngine2$Server$ProtoBufRpcInvoker.call(ProtobufRpcEngine2.java:573)
at org.apache.hadoop.ipc.RPC$Server.call(RPC.java:1227)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1247)
at org.apache.hadoop.ipc.Server$RpcCall.run(Server.java:1170)
at java.security.AccessController.doPrivileged(Native Method)
at javax.security.auth.Subject.doAs(Subject.java:422)
at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1964)
at org.apache.hadoop.ipc.Server$Handler.run(Server.java:3200){code}
 

*Full call stack for reference:*

LogsCLI.getAMContainerInfoForRMWebService -›
WebAppUtils.execOnActiveRM -›
RMHAUtils.findActiveRMHAId(conf) -›
RMHAUtils.getHAState -›
proto.getServiceStatus().getState() -›
AdminService.getServiceStatus -›
AdminService.checkAccess
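
To illustrate why the access check at the bottom of this call stack surfaces 
on the client as a missing active RM, here is a minimal, self-contained sketch. 
It is not the Hadoop source: {{ActiveRmLookupSketch}}, {{HaStateProbe}} and 
{{findActiveRmId}} are hypothetical names, and the sketch assumes that the 
lookup skips any RM whose state check fails, which is my reading of the 
behaviour described in this issue.

```java
import java.util.List;

final class ActiveRmLookupSketch {

    /** Hypothetical stand-in for the admin-only HA state RPC. */
    interface HaStateProbe {
        String stateOf(String rmId) throws Exception;
    }

    /**
     * Returns the id of the first RM that reports ACTIVE, or null if no RM
     * could be confirmed as active.
     */
    static String findActiveRmId(List<String> rmIds, HaStateProbe probe) {
        for (String rmId : rmIds) {
            try {
                if ("ACTIVE".equals(probe.stateOf(rmId))) {
                    return rmId;
                }
            } catch (Exception e) {
                // Assumption: a failed check is skipped, so "access denied"
                // becomes indistinguishable from "this RM is not active".
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> rmIds = List.of("rm1", "rm2");

        // An admin user can read the HA state of every RM.
        HaStateProbe asAdmin = rmId -> rmId.equals("rm2") ? "ACTIVE" : "STANDBY";

        // A non-admin user is rejected on every call.
        HaStateProbe asNonAdmin = rmId -> {
            throw new SecurityException("no permission to call 'getServiceState'");
        };

        System.out.println(findActiveRmId(rmIds, asAdmin));
        System.out.println(findActiveRmId(rmIds, asNonAdmin));
    }
}
```

With the non-admin probe every check is rejected, so the lookup returns null 
even though one of the RMs is active.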

 

Currently, the _execOnActiveRM_ method of {_}WebAppUtils{_} throws an exception 
stating that "No active RM is available" when _RMHAUtils.findActiveRMHAId_ 
returns null 
[here|https://github.com/apache/hadoop/blob/trunk/hadoop-yarn-project/hadoop-yarn/hadoop-yarn-common/src/main/java/org/apache/hadoop/yarn/webapp/util/WebAppUtils.java#L116].
However, that method also returns null if the caller lacks permission to check 
the service states. I think at this point we could fall back to the original 
code and try to find the active RM by iterating through the configured RMs. 
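
The proposed fallback could look roughly like the sketch below. This is only 
an illustration under stated assumptions, not a patch: 
{{ActiveRmFallbackSketch}}, {{resolveActiveRm}}, {{haLookup}} and 
{{respondsAsActive}} are hypothetical names, {{haLookup}} stands in for 
_RMHAUtils.findActiveRMHAId_, and {{respondsAsActive}} stands in for whatever 
non-admin probe the original iteration code used. The exception type is also 
chosen only for the sketch.

```java
import java.util.List;
import java.util.function.Predicate;
import java.util.function.Supplier;

final class ActiveRmFallbackSketch {

    /**
     * Resolves the RM id to send the request to.
     *
     * @param haLookup         admin-only lookup; may return null when the
     *                         caller is not allowed to read the HA state
     * @param rmIds            all configured RM ids, in configuration order
     * @param respondsAsActive hypothetical non-admin probe of a single RM
     */
    static String resolveActiveRm(Supplier<String> haLookup,
                                  List<String> rmIds,
                                  Predicate<String> respondsAsActive) {
        String activeId = haLookup.get();
        if (activeId != null) {
            return activeId;
        }
        // Fallback: null may only mean "not permitted", so try each RM in turn
        // instead of failing immediately.
        for (String rmId : rmIds) {
            if (respondsAsActive.test(rmId)) {
                return rmId;
            }
        }
        throw new IllegalStateException("No active RM is available");
    }

    public static void main(String[] args) {
        List<String> rmIds = List.of("rm1", "rm2");
        // Non-admin caller: the HA lookup yields null, the probe finds rm2.
        String id = resolveActiveRm(() -> null, rmIds, rmId -> rmId.equals("rm2"));
        System.out.println(id);
    }
}
```

The fast path through the HA service is kept for admin users, and the error 
is only raised once every configured RM has been tried.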

The issue only happens in HA mode, and only if the "{_}-am{_}" option is used; 
without this option, the AM logs can be retrieved together with the aggregated 
logs.


> YARN CLI fails to fetch logs with "-am" option if user is not in Admin ACLs
> ---------------------------------------------------------------------------
>
>                 Key: YARN-11836
>                 URL: https://issues.apache.org/jira/browse/YARN-11836
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: yarn-common
>    Affects Versions: 3.4.0, 3.4.1
>            Reporter: Peter Szucs
>            Assignee: Peter Szucs
>            Priority: Major


