[
https://issues.apache.org/jira/browse/YARN-5098?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Yesha Vora updated YARN-5098:
-----------------------------
Description:
Scenario (HA environment):
* Set dfs.namenode.delegation.token.max-lifetime=43200000 and dfs.namenode.delegation.token.renew-interval=28800000 (see the sketch after this list)
* Start long-running applications
* Let these applications run for ~3 days
* After 3 days, kill the applications
* Try to get the application logs for the above long-running apps
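For context on the two settings above: 43200000 ms is a 12 hour maximum token lifetime and 28800000 ms is an 8 hour renew interval, so an application that runs for ~3 days outlives the HDFS delegation token it was submitted with. The small sketch below is purely illustrative (the class name is made up, and in the real reproduction the values live in hdfs-site.xml rather than being set in code):
{code}
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hdfs.DFSConfigKeys;

// Illustrative sketch only: makes the token timeline of the scenario explicit.
public class TokenLifetimeSketch {
  public static void main(String[] args) {
    Configuration conf = new Configuration();
    // In the reproduction these are set cluster-wide in hdfs-site.xml; setting
    // them here in code is only to keep the sketch self-contained.
    conf.setLong(DFSConfigKeys.DFS_NAMENODE_DELEGATION_TOKEN_MAX_LIFETIME_KEY, 43200000L);
    conf.setLong(DFSConfigKeys.DFS_NAMENODE_DELEGATION_TOKEN_RENEW_INTERVAL_KEY, 28800000L);

    long maxLifetimeHours = conf.getLong(
        DFSConfigKeys.DFS_NAMENODE_DELEGATION_TOKEN_MAX_LIFETIME_KEY, 0) / 3600000L;
    long renewIntervalHours = conf.getLong(
        DFSConfigKeys.DFS_NAMENODE_DELEGATION_TOKEN_RENEW_INTERVAL_KEY, 0) / 3600000L;

    // 12 h max lifetime, 8 h renew interval: a ~3 day (72 h) run outlives the
    // original HDFS delegation token several times over, so by the time the app
    // is killed the token can no longer be renewed or used.
    System.out.println("max-lifetime = " + maxLifetimeHours
        + " h, renew-interval = " + renewIntervalHours + " h");
  }
}
{code}
The last step of the scenario is presumably the standard CLI call, yarn logs -applicationId <app id>.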
However, the YARN application logs for these long-running applications could not be gathered because the NodeManager failed to talk to HDFS with the error below (a sketch of the failing call path follows the trace).
{code}
2016-05-16 18:18:28,533 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:finishLogAggregation(555)) - Application just finished : application_1463170334122_0002
2016-05-16 18:18:28,545 WARN ipc.Client (Client.java:run(705)) - Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 171 for hrt_qa) can't be found in cache
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:375)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:583)
    at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:398)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:752)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:748)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1719)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:747)
    at org.apache.hadoop.ipc.Client$Connection.access$3100(Client.java:398)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1597)
    at org.apache.hadoop.ipc.Client.call(Client.java:1439)
    at org.apache.hadoop.ipc.Client.call(Client.java:1386)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:240)
    at com.sun.proxy.$Proxy83.getServerDefaults(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getServerDefaults(ClientNamenodeProtocolTranslatorPB.java:282)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy84.getServerDefaults(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getServerDefaults(DFSClient.java:1018)
    at org.apache.hadoop.fs.Hdfs.getServerDefaults(Hdfs.java:156)
    at org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:550)
    at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:687)
{code}
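The bottom of the trace (FileContext.create -> AbstractFileSystem.create -> DFSClient.getServerDefaults) is the NodeManager-side log writer opening the aggregated log file in HDFS with the application's, by now expired, delegation token. Below is a minimal sketch of that call path under stated assumptions; it is not the actual AppLogAggregatorImpl code, and the token file and target path are placeholders:
{code}
import java.security.PrivilegedExceptionAction;
import java.util.EnumSet;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.CreateFlag;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileContext;
import org.apache.hadoop.fs.Options;
import org.apache.hadoop.fs.Path;
import org.apache.hadoop.security.Credentials;
import org.apache.hadoop.security.UserGroupInformation;

// Minimal sketch, NOT the actual AppLogAggregatorImpl code: a UGI carrying the
// application's HDFS delegation token writes a file to HDFS via FileContext.
public class AggregatedLogWriteSketch {
  public static void main(String[] args) throws Exception {
    Configuration conf = new Configuration();

    // Placeholder: a token file of the kind handed to the NM at container launch.
    Credentials creds = Credentials.readTokenStorageFile(
        new Path("file:///tmp/appTokens.bin"), conf);

    UserGroupInformation ugi = UserGroupInformation.createRemoteUser("hrt_qa");
    ugi.addCredentials(creds);

    ugi.doAs((PrivilegedExceptionAction<Void>) () -> {
      FileContext fc = FileContext.getFileContext(conf);
      // Placeholder path; the real layout comes from yarn.nodemanager.remote-app-log-dir.
      Path logFile = new Path(
          "/app-logs/hrt_qa/logs/application_1463170334122_0002/node.example.com_45454");
      // create() first calls getServerDefaults() on the NameNode; per the trace
      // above, that is the RPC that fails with InvalidToken once the token expires.
      try (FSDataOutputStream out = fc.create(logFile,
          EnumSet.of(CreateFlag.CREATE, CreateFlag.OVERWRITE),
          Options.CreateOpts.createParent())) {
        out.writeUTF("aggregated container logs would be written here");
      }
      return null;
    });
  }
}
{code}
Once the token has lived past dfs.namenode.delegation.token.max-lifetime, the NameNode drops it from its token cache, so the very first RPC (getServerDefaults) fails the SASL handshake with the InvalidToken error shown above, regardless of how the writing code is structured.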
was:
Scenario:
* Set dfs.namenode.delegation.token.max-lifetime=43200000 and dfs.namenode.delegation.token.renew-interval=28800000
* Start long-running applications
* Let these applications run for ~3 days
* After 3 days, kill the applications
* Try to get the application logs for the above long-running apps
However, the YARN application logs for these long-running applications could not be gathered because the NodeManager failed to talk to HDFS with the error below.
{code}
2016-05-16 18:18:28,533 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:finishLogAggregation(555)) - Application just finished : application_1463170334122_0002
2016-05-16 18:18:28,545 WARN ipc.Client (Client.java:run(705)) - Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 171 for hrt_qa) can't be found in cache
    at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:375)
    at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:583)
    at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:398)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:752)
    at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:748)
    at java.security.AccessController.doPrivileged(Native Method)
    at javax.security.auth.Subject.doAs(Subject.java:422)
    at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1719)
    at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:747)
    at org.apache.hadoop.ipc.Client$Connection.access$3100(Client.java:398)
    at org.apache.hadoop.ipc.Client.getConnection(Client.java:1597)
    at org.apache.hadoop.ipc.Client.call(Client.java:1439)
    at org.apache.hadoop.ipc.Client.call(Client.java:1386)
    at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:240)
    at com.sun.proxy.$Proxy83.getServerDefaults(Unknown Source)
    at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getServerDefaults(ClientNamenodeProtocolTranslatorPB.java:282)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
    at java.lang.reflect.Method.invoke(Method.java:498)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
    at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
    at com.sun.proxy.$Proxy84.getServerDefaults(Unknown Source)
    at org.apache.hadoop.hdfs.DFSClient.getServerDefaults(DFSClient.java:1018)
    at org.apache.hadoop.fs.Hdfs.getServerDefaults(Hdfs.java:156)
    at org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:550)
    at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:687)
{code}
> YARN application log aggregation fails because the NM cannot get a correct HDFS
> delegation token
> -------------------------------------------------------------------------------------------
>
> Key: YARN-5098
> URL: https://issues.apache.org/jira/browse/YARN-5098
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Reporter: Yesha Vora
>
> Scenario (HA environment):
> * Set dfs.namenode.delegation.token.max-lifetime=43200000 and dfs.namenode.delegation.token.renew-interval=28800000
> * Start long-running applications
> * Let these applications run for ~3 days
> * After 3 days, kill the applications
> * Try to get the application logs for the above long-running apps
> However, the YARN application logs for these long-running applications could not be gathered because the NodeManager failed to talk to HDFS with the error below.
> {code}
> 2016-05-16 18:18:28,533 INFO logaggregation.AppLogAggregatorImpl (AppLogAggregatorImpl.java:finishLogAggregation(555)) - Application just finished : application_1463170334122_0002
> 2016-05-16 18:18:28,545 WARN ipc.Client (Client.java:run(705)) - Exception encountered while connecting to the server : org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken): token (HDFS_DELEGATION_TOKEN token 171 for hrt_qa) can't be found in cache
>     at org.apache.hadoop.security.SaslRpcClient.saslConnect(SaslRpcClient.java:375)
>     at org.apache.hadoop.ipc.Client$Connection.setupSaslConnection(Client.java:583)
>     at org.apache.hadoop.ipc.Client$Connection.access$1900(Client.java:398)
>     at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:752)
>     at org.apache.hadoop.ipc.Client$Connection$2.run(Client.java:748)
>     at java.security.AccessController.doPrivileged(Native Method)
>     at javax.security.auth.Subject.doAs(Subject.java:422)
>     at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1719)
>     at org.apache.hadoop.ipc.Client$Connection.setupIOstreams(Client.java:747)
>     at org.apache.hadoop.ipc.Client$Connection.access$3100(Client.java:398)
>     at org.apache.hadoop.ipc.Client.getConnection(Client.java:1597)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1439)
>     at org.apache.hadoop.ipc.Client.call(Client.java:1386)
>     at org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:240)
>     at com.sun.proxy.$Proxy83.getServerDefaults(Unknown Source)
>     at org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getServerDefaults(ClientNamenodeProtocolTranslatorPB.java:282)
>     at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
>     at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:62)
>     at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>     at java.lang.reflect.Method.invoke(Method.java:498)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:256)
>     at org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:104)
>     at com.sun.proxy.$Proxy84.getServerDefaults(Unknown Source)
>     at org.apache.hadoop.hdfs.DFSClient.getServerDefaults(DFSClient.java:1018)
>     at org.apache.hadoop.fs.Hdfs.getServerDefaults(Hdfs.java:156)
>     at org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:550)
>     at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:687)
> {code}
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
---------------------------------------------------------------------
To unsubscribe, e-mail: [email protected]
For additional commands, e-mail: [email protected]