[ https://issues.apache.org/jira/browse/YARN-3779?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14581712#comment-14581712 ]
Varun Saxena commented on YARN-3779:
------------------------------------

[~xgong], after applying the patch, the debug log produced on refreshing the log retention settings looks like the following. I will also post both the success and error logs a little later.

{noformat}
2015-06-11 14:49:56,973 DEBUG org.apache.hadoop.ipc.Server: Socket Reader #1 for port 10033: responding to null from 10.19.92.82:30295 Call#-33 Retry#-1 Wrote 22 bytes.
2015-06-11 14:49:56,981 DEBUG org.apache.hadoop.ipc.Server: got #-3
2015-06-11 14:49:57,014 DEBUG org.apache.hadoop.ipc.Server: Successfully authorized userInfo { effectiveUser: "hdfs/hua...@hadoop.com" } protocol: "org.apache.hadoop.mapreduce.v2.api.HSAdminRefreshProtocol"
2015-06-11 14:49:57,014 DEBUG org.apache.hadoop.ipc.Server: got #0
2015-06-11 14:49:57,015 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 0 on 10033: org.apache.hadoop.mapreduce.v2.api.HSAdminRefreshProtocol.refreshLogRetentionSettings from 10.19.92.82:30295 Call#0 Retry#0 for RpcKind RPC_PROTOCOL_BUFFER
2015-06-11 14:49:57,016 DEBUG org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:hdfs/hua...@hadoop.com (auth:KERBEROS) from:org.apache.hadoop.ipc.Server$Handler.run(Server.java:2082)
2015-06-11 14:49:57,027 INFO org.apache.hadoop.mapreduce.v2.hs.server.HSAdminServer: HS Admin: refreshLogRetentionSettings invoked by user hdfs
2015-06-11 14:49:57,027 DEBUG org.apache.hadoop.ipc.Client: stopping client from cache: org.apache.hadoop.ipc.Client@2dfaea86
2015-06-11 14:49:57,079 DEBUG org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:hdfs/hua...@hadoop.com (auth:KERBEROS) from:org.apache.hadoop.yarn.client.RMProxy.getProxy(RMProxy.java:136)
2015-06-11 14:49:57,079 DEBUG org.apache.hadoop.yarn.ipc.YarnRPC: Creating YarnRPC for org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC
2015-06-11 14:49:57,079 DEBUG org.apache.hadoop.yarn.ipc.HadoopYarnProtoRPC: Creating a HadoopYarnProtoRpc proxy for protocol interface org.apache.hadoop.yarn.api.ApplicationClientProtocol
2015-06-11 14:49:57,080 DEBUG org.apache.hadoop.ipc.Client: getting client out of cache: org.apache.hadoop.ipc.Client@2dfaea86
2015-06-11 14:49:57,081 INFO org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated log deletion started.
2015-06-11 14:49:57,081 INFO org.apache.hadoop.mapreduce.v2.hs.HSAuditLogger: USER=hdfs IP=10.19.92.82 OPERATION=refreshLogRetentionSettings TARGET=HSAdminServer RESULT=SUCCESS
2015-06-11 14:49:57,081 DEBUG org.apache.hadoop.security.UserGroupInformation: PrivilegedAction as:hdfs/hua...@hadoop.com (auth:KERBEROS) from:org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:83)
2015-06-11 14:49:57,081 DEBUG org.apache.hadoop.ipc.Server: Served: refreshLogRetentionSettings queueTime= 11 procesingTime= 55
2015-06-11 14:49:57,082 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 0 on 10033: responding to org.apache.hadoop.mapreduce.v2.api.HSAdminRefreshProtocol.refreshLogRetentionSettings from 10.19.92.82:30295 Call#0 Retry#0
2015-06-11 14:49:57,083 DEBUG org.apache.hadoop.ipc.Server: IPC Server handler 0 on 10033: responding to org.apache.hadoop.mapreduce.v2.api.HSAdminRefreshProtocol.refreshLogRetentionSettings from 10.19.92.82:30295 Call#0 Retry#0 Wrote 32 bytes.
2015-06-11 14:49:57,083 DEBUG org.apache.hadoop.ipc.Client: IPC Client (889891977) connection to /10.19.92.82:65110 from hdfs/hua...@hadoop.com sending #5
2015-06-11 14:49:57,084 DEBUG org.apache.hadoop.ipc.Client: IPC Client (889891977) connection to /10.19.92.82:65110 from hdfs/hua...@hadoop.com got value #5
2015-06-11 14:49:57,084 DEBUG org.apache.hadoop.ipc.ProtobufRpcEngine: Call: getListing took 1ms
2015-06-11 14:49:57,085 INFO org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService: aggregated log deletion finished.
{noformat}

> Aggregated Logs Deletion doesnt work after refreshing Log Retention Settings in secure cluster
> ----------------------------------------------------------------------------------------------
>
> Key: YARN-3779
> URL: https://issues.apache.org/jira/browse/YARN-3779
> Project: Hadoop YARN
> Issue Type: Bug
> Affects Versions: 2.7.0
> Environment: mrV2, secure mode
> Reporter: Zhang Wei
> Assignee: Varun Saxena
> Priority: Critical
> Attachments: YARN-3779.01.patch, YARN-3779.02.patch
>
>
> {{GSSException}} is thrown every time deletion of aggregated logs is attempted after executing {{bin/mapred hsadmin -refreshLogRetentionSettings}} in a secure cluster.
> The problem can be reproduced with the following steps:
> 1. Start up the historyserver in a secure cluster.
> 2. Log deletion happens as expected.
> 3. Execute the {{mapred hsadmin -refreshLogRetentionSettings}} command to refresh the configuration value.
> 4. All subsequent log deletion attempts fail with {{GSSException}}.
> The following exception can be found in the historyserver's log if log deletion is enabled.
> {noformat}
> 2015-06-04 14:14:40,070 | ERROR | Timer-3 | Error reading root log dir this deletion attempt is being aborted | AggregatedLogDeletionService.java:127
> java.io.IOException: Failed on local exception: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]; Host Details : local host is: "vm-31/9.91.12.31"; destination host is: "vm-33":25000;
>     at org.apache.hadoop.net.NetUtils.wrapException(NetUtils.java:764)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1414)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1363)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>     at com.sun.proxy.$Proxy9.getListing(Unknown Source)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getListing(ClientNamenodeProtocolTranslatorPB.java:519)
>     at sun.reflect.GeneratedMethodAccessor16.invoke(Unknown Source)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:606)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>     at com.sun.proxy.$Proxy10.getListing(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1767)
>     at org.apache.hadoop.hdfs.DFSClient.listPaths(DFSClient.java:1750)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.listStatusInternal(DistributedFileSystem.java:691)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.access$600(DistributedFileSystem.java:102)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:753)
>     at org.apache.hadoop.hdfs.DistributedFileSystem$15.doCall(DistributedFileSystem.java:749)
>     at org.apache.hadoop.fs.FileSystemLinkResolver.resolve(FileSystemLinkResolver.java:81)
>     at org.apache.hadoop.hdfs.DistributedFileSystem.listStatus(DistributedFileSystem.java:749)
>     at org.apache.hadoop.yarn.logaggregation.AggregatedLogDeletionService$LogDeletionTask.run(AggregatedLogDeletionService.java:68)
>     at java.util.TimerThread.mainLoop(Timer.java:555)
>     at java.util.TimerThread.run(Timer.java:505)
> Caused by: java.io.IOException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>     at org.apache.hadoop.ipc.Client$Connection$1.run(Client.java:677)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
>     at org.apache.hadoop.ipc.Client$Connection.handleSaslConnectionFailure(Client.java:640)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:724)
>     at org.apache.hadoop.ipc.Client$Connection.access$2800(Client.java:367)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1462)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1381)
>     ... 21 more
> Caused by: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)]
>     at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:212)
>     at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:411)
>     at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:550)
>     at org.apache.hadoop.ipc.Client$Connection.access$1800(Client.java:367)
>     at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:716)
>     at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:712)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:415)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1641)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:711)
>     ... 24 more
> Caused by: GSSException: No valid credentials provided (Mechanism level: Failed to find any Kerberos tgt)
>     at sun.security.jgss.krb5.Krb5InitCredential.getInstance(Krb5InitCredential.java:147)
>     at sun.security.jgss.krb5.Krb5MechFactory.getCredentialElement(Krb5MechFactory.java:121)
>     at sun.security.jgss.krb5.Krb5MechFactory.getMechanismContext(Krb5MechFactory.java:187)
>     at sun.security.jgss.GSSManagerImpl.getMechanismContext(GSSManagerImpl.java:223)
>     at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:212)
>     at sun.security.jgss.GSSContextImpl.initSecContext(GSSContextImpl.java:179)
>     at com.sun.security.sasl.gsskerb.GssKrb5Client.evaluateChallenge(GssKrb5Client.java:193)
>     ... 33 more
> {noformat}

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)