[
https://issues.apache.org/jira/browse/YARN-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Robert Kanter updated YARN-7262:
--------------------------------
Attachment: YARN-7262.001.patch
The patch adds the ability to configure a hierarchy like that in YARN-2962. I
generalized and reused code from YARN-2962 when possible; otherwise, I tried to
mirror the YARN-2962 code. There are two big differences:
# The app znodes in YARN-2962 had children (for app attempts), which we don't
have to worry about here because delegation token znodes don't have children.
# YARN-2962 adds an extra level named "HIERARCHIES" that doesn't seem to be
necessary. The token znode path is already quite long, so I omitted that. The
layout looks like this:
{noformat}
* |--- RM_DT_SECRET_MANAGER_ROOT
* |----- RM_DT_SEQUENTIAL_NUMBER_ZNODE_NAME
* |----- RM_DELEGATION_TOKENS_ROOT_ZNODE_NAME
* | |----- 1
* | | |----- (#TokenId barring last character)
* | | | |----- (#Last character of TokenId)
* | | ....
* | |----- 2
* | | |----- (#TokenId barring last 2 characters)
* | | | |----- (#Last 2 characters of TokenId)
* | | ....
* | |----- 3
* | | |----- (#TokenId barring last 3 characters)
* | | | |----- (#Last 3 characters of TokenId)
* | | ....
* | |----- 4
* | | |----- (#TokenId barring last 4 characters)
* | | | |----- (#Last 4 characters of TokenId)
* | | ....
* | |----- Token_1
* | |----- Token_2
* | ....
{noformat}
YARN-2962 had "HIERARCHIES" next to "Token_#" with "1", "2", "3", and "4" under
it. Here, we just put "1", "2", "3", and "4" next to "Token_#".
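To make the layout above concrete, here is a minimal sketch of how a token znode name could be mapped to its path under RM_DELEGATION_TOKENS_ROOT_ZNODE_NAME. The class and method names are hypothetical and are not taken from the patch. It assumes that a split index of 0 means the flat layout and that the split is a plain cut of the last N characters of the node name.

```java
// Hypothetical helper for illustration only; not code from the patch.
final class TokenZnodeLayoutSketch {

  /**
   * Returns the znode path relative to the delegation tokens root.
   * Assumption: splitIndex 0 is the flat layout; 1..4 puts the node under
   * the "1".."4" znode and makes its last splitIndex characters a child.
   */
  static String relativePath(String nodeName, int splitIndex) {
    if (splitIndex < 0 || splitIndex > 4) {
      throw new IllegalArgumentException("split index must be 0..4");
    }
    if (splitIndex == 0) {
      return nodeName; // flat layout: the node sits directly under the root
    }
    int cut = nodeName.length() - splitIndex;
    if (cut <= 0) {
      throw new IllegalArgumentException("node name too short to split");
    }
    // <split index>/<name barring last N characters>/<last N characters>
    return splitIndex + "/" + nodeName.substring(0, cut)
        + "/" + nodeName.substring(cut);
  }

  public static void main(String[] args) {
    System.out.println(relativePath("RMDelegationToken_5", 0));
    // prints RMDelegationToken_5
    System.out.println(relativePath("RMDelegationToken_0005", 1));
    // prints 1/RMDelegationToken_000/5
    System.out.println(relativePath("RMDelegationToken_0005", 2));
    // prints 2/RMDelegationToken_00/05
  }
}
```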
Some more useful info about the patch:
- The default behavior is to use a flat layout, like before.
{{yarn.resourcemanager.zk-delegation-token-node.split-index}} can be set to
{{1}}, {{2}}, {{3}}, or {{4}} to split on the last 1, 2, 3, or 4 digits of the
token sequence number; {{0}} keeps the flat layout.
- Token sequence numbers start at {{0}} and have a variable width, unlike
Application IDs, which have a width of 4, so when naming their znodes, the code
pads them to at least 4 digits. For example, {{RMDelegationToken_5}} becomes
{{RMDelegationToken_0005}}. This ensures that the index splitting works
correctly. The exception is the flat layout, where the names are left unpadded
so that they stay the same as before.
- When looking for a delegation token znode, the code first tries the path for
the current value of
{{yarn.resourcemanager.zk-delegation-token-node.split-index}}, but it will fall
back to the other possible znode paths in case the token was created when
{{yarn.resourcemanager.zk-delegation-token-node.split-index}} was set to a
different value. This ensures we don't lose any tokens when
{{yarn.resourcemanager.zk-delegation-token-node.split-index}} changes.
- I haven't had a chance to try it out in an actual cluster yet, but there are
unit tests that show it working correctly. In the meantime, we can still start
reviews.
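The padding and fallback behavior described in the bullets above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch's code: the class, method, and constant names are hypothetical, and the order in which the non-configured split indexes are tried is my own choice, since only "current value first" is specified above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration only; not code from the patch.
final class TokenZnodeLookupSketch {
  static final String PREFIX = "RMDelegationToken_";
  static final int MAX_SPLIT_INDEX = 4;

  /** Flat layout keeps the old unpadded name; otherwise pad to at least 4 digits. */
  static String nodeName(int sequenceNumber, int splitIndex) {
    return splitIndex == 0
        ? PREFIX + sequenceNumber
        : PREFIX + String.format(Locale.ROOT, "%04d", sequenceNumber);
  }

  /** Path relative to the delegation tokens root for one split index (0..4). */
  static String relativePath(int sequenceNumber, int splitIndex) {
    String name = nodeName(sequenceNumber, splitIndex);
    if (splitIndex == 0) {
      return name;
    }
    int cut = name.length() - splitIndex;
    return splitIndex + "/" + name.substring(0, cut) + "/" + name.substring(cut);
  }

  /**
   * Paths to probe when looking up a token: the configured split index
   * first, then every other index (assumed order: ascending).
   */
  static List<String> candidatePaths(int sequenceNumber, int configuredSplitIndex) {
    List<String> paths = new ArrayList<>();
    paths.add(relativePath(sequenceNumber, configuredSplitIndex));
    for (int i = 0; i <= MAX_SPLIT_INDEX; i++) {
      if (i != configuredSplitIndex) {
        paths.add(relativePath(sequenceNumber, i));
      }
    }
    return paths;
  }

  public static void main(String[] args) {
    // Token 5 looked up while the split index is configured as 2.
    for (String path : candidatePaths(5, 2)) {
      System.out.println(path);
    }
  }
}
```

With a configured split index of 2, token 5 would be probed at `2/RMDelegationToken_00/05` first, then at `RMDelegationToken_5`, `1/RMDelegationToken_000/5`, `3/RMDelegationToken_0/005`, and `4/RMDelegationToken_/0005`. A real lookup would stop at the first path that exists in ZooKeeper.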
> Add a hierarchy into the ZKRMStateStore for delegation token znodes to
> prevent jute buffer overflow
> ---------------------------------------------------------------------------------------------------
>
> Key: YARN-7262
> URL: https://issues.apache.org/jira/browse/YARN-7262
> Project: Hadoop YARN
> Issue Type: Improvement
> Affects Versions: 2.6.0
> Reporter: Robert Kanter
> Assignee: Robert Kanter
> Attachments: YARN-7262.001.patch
>
>
> We've seen users who are running into a problem where the RM is storing so
> many delegation tokens in the {{ZKRMStateStore}} that the _listing_ of those
> znodes is larger than the jute buffer. This is fine during normal operation,
> but becomes a problem on a failover because the RM will try to read in all of
> the token znodes (i.e. call {{getChildren}} on the parent znode). This is
> particularly bad because everything appears to be okay, but then if a
> failover occurs you end up with no active RMs.
> There was a similar problem with the YARN application data that was fixed in
> YARN-2962 by adding a (configurable) hierarchy of znodes so the RM could pull
> subchildren without overflowing the jute buffer (though it's off by default).
> We should add a hierarchy similar to that of YARN-2962, but for the
> delegation token znodes.