[ https://issues.apache.org/jira/browse/YARN-10348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Jim Brennan updated YARN-10348:
-------------------------------
Attachment: YARN-10348-branch-3.2.001.patch
> Allow RM to always cancel tokens after app completes
> ----------------------------------------------------
>
> Key: YARN-10348
> URL: https://issues.apache.org/jira/browse/YARN-10348
> Project: Hadoop YARN
> Issue Type: Bug
> Components: yarn
> Affects Versions: 2.10.0, 3.1.3
> Reporter: Jim Brennan
> Assignee: Jim Brennan
> Priority: Major
> Fix For: 3.4.0, 3.3.1
>
> Attachments: YARN-10348-branch-3.2.001.patch, YARN-10348.001.patch,
> YARN-10348.002.patch
>
>
> (Note: this change was originally done on our internal branch by [~daryn]).
> The RM currently lets a client disable token cancellation when a job
> completes. This feature was an initial attempt to address the case where a
> job launches sub-jobs (e.g. the Oozie launcher) and finishes before the
> sub-jobs complete: completion of the original job triggered premature
> cancellation of tokens needed by the sub-jobs.
> Many years ago, [~daryn] added a more robust implementation that ref counts
> tokens ([YARN-3055]). It defers cancellation of a token until all apps
> using the token complete, which removed the need for a client to specify
> cancel=false. Unfortunately the config option was not removed.
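
A minimal sketch of the ref-counting idea described above, assuming tokens and
apps are identified by plain strings. The class and method names are
hypothetical and are not taken from the RM code or from YARN-3055; the sketch
only shows why the last app to finish is the one that triggers cancellation.

```java
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical illustration only: not the RM's actual token renewer.
final class TokenRefCountSketch {
    // token id -> ids of the applications that still reference the token
    private final Map<String, Set<String>> appsByToken = new HashMap<>();

    // Record that an application uses a token.
    void register(String tokenId, String appId) {
        appsByToken.computeIfAbsent(tokenId, k -> new HashSet<>()).add(appId);
    }

    // Called when an application completes. Returns true only when the
    // completed application was the last one referencing the token,
    // i.e. when cancelling the token can no longer break another app.
    boolean release(String tokenId, String appId) {
        Set<String> apps = appsByToken.get(tokenId);
        if (apps == null || !apps.remove(appId)) {
            return false; // unknown token or app: nothing to cancel
        }
        if (apps.isEmpty()) {
            appsByToken.remove(tokenId);
            return true;
        }
        return false;
    }
}
```

With a launcher and a sub-job registered against the same token, releasing the
launcher first returns false, and the token only becomes cancellable once the
sub-job is released as well.
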
> We have seen cases where Oozie "java actions" and some users were explicitly
> disabling token cancellation. This can lead to a buildup of defunct tokens
> that may overwhelm the ZK buffer used by the KDC's backing store, at which
> point the KMS fails to connect to ZK and is unable to issue/validate new
> tokens, rendering the KDC only able to authenticate pre-existing tokens.
> Production incidents have occurred due to the buffer size issue.
> To avoid these issues, the RM should have the option to ignore/override the
> client's request to not cancel tokens.
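
The proposed behavior can be sketched as a single decision, assuming a
boolean RM-side setting. The names below are hypothetical and the actual
property name is whatever the attached patches define.

```java
// Hypothetical illustration only: not code from the attached patches.
final class CancelPolicySketch {
    // rmAlwaysCancel: assumed RM-side setting that overrides the client.
    // clientRequestedCancel: false when the client asked to keep its tokens.
    static boolean shouldCancel(boolean rmAlwaysCancel, boolean clientRequestedCancel) {
        // The client's cancel=false is honored only when the RM override is off.
        return rmAlwaysCancel || clientRequestedCancel;
    }
}
```

With the override off, the existing behavior is unchanged. With it on, a
client's cancel=false no longer leaves defunct tokens behind, and the ref
counting still protects tokens shared with running sub-jobs.
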