[ 
https://issues.apache.org/jira/browse/YARN-3190?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14368688#comment-14368688
 ] 

zhihai xu commented on YARN-3190:
---------------------------------

[~dubislv],
Could you try the following workaround to see whether it solves your issue?
Set the following property for all Oozie jobs (job client):
<property>
<name>mapreduce.job.complete.cancel.delegation.tokens</name>
<value>false</value>
</property>
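
As a rough sketch of where this could go: one option is the <configuration> block of a workflow action, shown below for a map-reduce action. The action name "mr-node", the transition targets "end" and "fail", and the ${jobTracker} / ${nameNode} parameters are placeholders and not taken from this issue. Whether an action-level setting reaches every job client in a given deployment is an assumption to verify. The idea behind the workaround is that, with this property set to false, the job's delegation tokens should not be cancelled when the job completes, so the NM may still find the HDFS token valid when it aggregates logs.
{code:xml}
<!-- Hypothetical workflow.xml fragment; names and parameters are placeholders -->
<action name="mr-node">
  <map-reduce>
    <job-tracker>${jobTracker}</job-tracker>
    <name-node>${nameNode}</name-node>
    <configuration>
      <!-- Do not cancel delegation tokens when the job completes -->
      <property>
        <name>mapreduce.job.complete.cancel.delegation.tokens</name>
        <value>false</value>
      </property>
    </configuration>
  </map-reduce>
  <ok to="end"/>
  <error to="fail"/>
</action>
{code}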

> NM can't aggregate logs: token  can't be found in cache
> -------------------------------------------------------
>
>                 Key: YARN-3190
>                 URL: https://issues.apache.org/jira/browse/YARN-3190
>             Project: Hadoop YARN
>          Issue Type: Bug
>          Components: nodemanager
>    Affects Versions: 2.5.0
>         Environment: CDH 5.3.1
> HA HDFS
> Kerberos
>            Reporter: Andrejs Dubovskis
>            Priority: Minor
>
> In rare cases the node manager cannot aggregate logs and logs the following exception:
> {code}
> 2015-02-12 13:04:03,703 INFO 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl:
>  Starting aggregate log-file for app application_1423661043235_2150 at 
> /tmp/logs/catalyst/logs/application_1423661043235_2150/catdn001.intrum.net_8041.tmp
> 2015-02-12 13:04:03,707 INFO 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting 
> absolute path : 
> /data5/yarn/nm/usercache/catalyst/appcache/application_1423661043235_2150/container_1423661043235_2150_01_000442
> 2015-02-12 13:04:03,707 INFO 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting 
> absolute path : 
> /data6/yarn/nm/usercache/catalyst/appcache/application_1423661043235_2150/container_1423661043235_2150_01_000442
> 2015-02-12 13:04:03,707 INFO 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting 
> absolute path : 
> /data7/yarn/nm/usercache/catalyst/appcache/application_1423661043235_2150/container_1423661043235_2150_01_000442
> 2015-02-12 13:04:03,709 INFO 
> org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor: Deleting 
> absolute path : 
> /data1/yarn/nm/usercache/catalyst/appcache/application_1423661043235_2150
> 2015-02-12 13:04:03,709 WARN org.apache.hadoop.security.UserGroupInformation: 
> PriviledgedActionException as:catalyst (auth:SIMPLE) 
> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 2334644 for catalyst) can't be found in 
> cache
> 2015-02-12 13:04:03,709 WARN org.apache.hadoop.ipc.Client: Exception 
> encountered while connecting to the server : 
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 2334644 for catalyst) can't be found in 
> cache
> 2015-02-12 13:04:03,709 WARN org.apache.hadoop.security.UserGroupInformation: 
> PriviledgedActionException as:catalyst (auth:SIMPLE) 
> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 2334644 for catalyst) can't be found in 
> cache
> 2015-02-12 13:04:03,712 WARN org.apache.hadoop.security.UserGroupInformation: 
> PriviledgedActionException as:catalyst (auth:SIMPLE) 
> cause:org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 2334644 for catalyst) can't be found in 
> cache
> 2015-02-12 13:04:03,712 ERROR 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl:
>  Cannot create writer for app application_1423661043235_2150. Disabling 
> log-aggregation for this app.
> org.apache.hadoop.ipc.RemoteException(org.apache.hadoop.security.token.SecretManager$InvalidToken):
>  token (HDFS_DELEGATION_TOKEN token 2334644 for catalyst) can't be found in 
> cache
>         at org.apache.hadoop.ipc.Client.call(Client.java:1411)
>         at org.apache.hadoop.ipc.Client.call(Client.java:1364)
>         at 
> org.apache.hadoop.ipc.ProtobufRpcEngine$Invoker.invoke(ProtobufRpcEngine.java:206)
>         at com.sun.proxy.$Proxy19.getServerDefaults(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.protocolPB.ClientNamenodeProtocolTranslatorPB.getServerDefaults(ClientNamenodeProtocolTranslatorPB.java:259)
>         at sun.reflect.GeneratedMethodAccessor114.invoke(Unknown Source)
>         at 
> sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
>         at java.lang.reflect.Method.invoke(Method.java:606)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invokeMethod(RetryInvocationHandler.java:187)
>         at 
> org.apache.hadoop.io.retry.RetryInvocationHandler.invoke(RetryInvocationHandler.java:102)
>         at com.sun.proxy.$Proxy20.getServerDefaults(Unknown Source)
>         at 
> org.apache.hadoop.hdfs.DFSClient.getServerDefaults(DFSClient.java:966)
>         at org.apache.hadoop.fs.Hdfs.getServerDefaults(Hdfs.java:159)
>         at 
> org.apache.hadoop.fs.AbstractFileSystem.create(AbstractFileSystem.java:543)
>         at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:680)
>         at org.apache.hadoop.fs.FileContext$3.next(FileContext.java:676)
>         at org.apache.hadoop.fs.FSLinkResolver.resolve(FSLinkResolver.java:90)
>         at org.apache.hadoop.fs.FileContext.create(FileContext.java:676)
>         at 
> org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter$1.run(AggregatedLogFormat.java:272)
>         at 
> org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter$1.run(AggregatedLogFormat.java:267)
>         at java.security.AccessController.doPrivileged(Native Method)
>         at javax.security.auth.Subject.doAs(Subject.java:415)
>         at 
> org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1642)
>         at 
> org.apache.hadoop.yarn.logaggregation.AggregatedLogFormat$LogWriter.<init>(AggregatedLogFormat.java:266)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.uploadLogsForContainer(AppLogAggregatorImpl.java:134)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.doAppLogAggregation(AppLogAggregatorImpl.java:196)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.AppLogAggregatorImpl.run(AppLogAggregatorImpl.java:166)
>         at 
> org.apache.hadoop.yarn.server.nodemanager.containermanager.logaggregation.LogAggregationService$2.run(LogAggregationService.java:372)
>         at 
> java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1145)
>         at 
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:615)
>         at java.lang.Thread.run(Thread.java:745)
> {code}



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
