[
https://issues.apache.org/jira/browse/YARN-5302?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15358491#comment-15358491
]
Xianyin Xin commented on YARN-5302:
-----------------------------------
Thanks, Varun. Maybe you are referring to YARN-4783. But from the discussion in
YARN-4783, it seems that in that case the RM has canceled the token, so it is
not secure to keep maintaining an HDFS delegation token for the app. In this
case, the app is still running, but the RM has requested a new HDFS token.
Because this exception happens during NM recovery, the RM's new token hasn't
been passed to the NM yet. The old token is read from the StateStore and causes
the exception.
Sorry for the insufficient information.
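To make the sequence concrete, here is a minimal sketch in plain Java. It does not use any Hadoop classes: Token, stateStore, createAppDir and recoverApp are hypothetical stand-ins, and the expiry check is only an assumption about how the stale token gets rejected. The real code rethrows the exception, while the sketch just reports the failure.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-ins only; none of these are Hadoop classes.
class TokenRecoverySketch {

    /** A delegation token reduced to an id and an expiry time in milliseconds. */
    static final class Token {
        final String id;
        final long expiresAtMillis;

        Token(String id, long expiresAtMillis) {
            this.id = id;
            this.expiresAtMillis = expiresAtMillis;
        }

        boolean isExpired(long nowMillis) {
            return nowMillis >= expiresAtMillis;
        }
    }

    // What the NM persisted before the restart: appId -> token it held at that time.
    final Map<String, Token> stateStore = new HashMap<>();
    // Apps that currently have a registered log aggregator.
    final Map<String, Token> appLogAggregators = new HashMap<>();

    /** Stands in for createAppDir: a filesystem call with an expired token fails. */
    static void createAppDir(String appId, Token token, long nowMillis) {
        if (token.isExpired(nowMillis)) {
            throw new IllegalStateException(
                "token " + token.id + " for " + appId + " is expired");
        }
    }

    /**
     * Recovery path: only the token found in the state store is available.
     * Returns true if log aggregation could be initialized for the app.
     */
    boolean recoverApp(String appId, long nowMillis) {
        Token stored = stateStore.get(appId);
        if (stored == null) {
            return false; // nothing was persisted for this app
        }
        appLogAggregators.put(appId, stored);
        try {
            createAppDir(appId, stored, nowMillis);
            return true;
        } catch (IllegalStateException e) {
            // Mirrors the catch block in initAppAggregator: the aggregator is dropped.
            appLogAggregators.remove(appId);
            return false;
        }
    }

    public static void main(String[] args) {
        long now = 1_000L;
        TokenRecoverySketch nm = new TokenRecoverySketch();

        // The token persisted before the restart has expired in the meantime.
        nm.stateStore.put("app_1", new Token("old", 500L));
        // The RM already holds a newer token, but the NM has not received it yet.
        Token requestedByRm = new Token("new", 5_000L);

        System.out.println("recover with stored token: " + nm.recoverApp("app_1", now));

        // Once the newer token reaches the NM, the same step succeeds.
        nm.stateStore.put("app_1", requestedByRm);
        System.out.println("recover with new token: " + nm.recoverApp("app_1", now));
    }
}
```

The point of the sketch is only the ordering: recovery runs with whatever the state store holds, before the RM's newer token has been delivered.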
> Yarn Application log Aggregation fails due to NM can not get correct HDFS
> delegation token II
> ----------------------------------------------------------------------------------------------
>
> Key: YARN-5302
> URL: https://issues.apache.org/jira/browse/YARN-5302
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Reporter: Xianyin Xin
>
> Unlike YARN-5098, this happens on the NM side. When the NM recovers,
> credentials are read from the NMStateStore. When the app aggregators are
> initialized, an exception is thrown because the tokens have expired.
> {code:title=LogAggregationService.java}
>   protected void initAppAggregator(final ApplicationId appId, String user,
>       Credentials credentials,
>       ContainerLogsRetentionPolicy logRetentionPolicy,
>       Map<ApplicationAccessType, String> appAcls,
>       LogAggregationContext logAggregationContext) {
>     // Get user's FileSystem credentials
>     final UserGroupInformation userUgi =
>         UserGroupInformation.createRemoteUser(user);
>     if (credentials != null) {
>       userUgi.addCredentials(credentials);
>     }
>     ...
>     try {
>       // Create the app dir
>       createAppDir(user, appId, userUgi);
>     } catch (Exception e) {
>       appLogAggregator.disableLogAggregation();
>       if (!(e instanceof YarnRuntimeException)) {
>         appDirException = new YarnRuntimeException(e);
>       } else {
>         appDirException = (YarnRuntimeException)e;
>       }
>       appLogAggregators.remove(appId);
>       closeFileSystems(userUgi);
>       throw appDirException;
>     }
> {code}