[ 
https://issues.apache.org/jira/browse/YARN-7262?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Robert Kanter updated YARN-7262:
--------------------------------
    Attachment: YARN-7262.001.patch

The patch adds the ability to configure a hierarchy like that in YARN-2962.  I 
generalized and reused code from YARN-2962 when possible; otherwise, I tried to 
mirror the YARN-2962 code.  There are two big differences:
# The app znodes in YARN-2962 had children (for app attempts), which we don't 
have to worry about here because delegation token znodes don't have children.
# YARN-2962 adds an extra level named "HIERARCHIES" that doesn't seem to be 
necessary.  The token znode path is already quite long, so I omitted that.  The 
layout looks like this:
{noformat}
 * |--- RM_DT_SECRET_MANAGER_ROOT
 *        |----- RM_DT_SEQUENTIAL_NUMBER_ZNODE_NAME
 *        |----- RM_DELEGATION_TOKENS_ROOT_ZNODE_NAME
 *        |       |----- 1
 *        |       |      |----- (#TokenId barring last character)
 *        |       |      |       |----- (#Last character of TokenId)
 *        |       |      ....
 *        |       |----- 2
 *        |       |      |----- (#TokenId barring last 2 characters)
 *        |       |      |       |----- (#Last 2 characters of TokenId)
 *        |       |      ....
 *        |       |----- 3
 *        |       |      |----- (#TokenId barring last 3 characters)
 *        |       |      |       |----- (#Last 3 characters of TokenId)
 *        |       |      ....
 *        |       |----- 4
 *        |       |      |----- (#TokenId barring last 4 characters)
 *        |       |      |       |----- (#Last 4 characters of TokenId)
 *        |       |      ....
 *        |       |----- Token_1
 *        |       |----- Token_2
 *        |       ....
{noformat}
YARN-2962 had "HIERARCHIES" next to "Token_#" with "1", "2", "3", and "4" under 
it.  Here, we just put "1", "2", "3", and "4" next to "Token_#".
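
As a rough illustration of the layout above (a sketch, not the code from the patch), the znode path for a token could be derived as follows. The class name, the method name, and the {{/RMDelegationTokensRoot}} root path are placeholders assumed for this example; only the {{RMDelegationToken_}} prefix and the padding rule come from the description in this comment.

```java
import java.util.Locale;

// Hypothetical helper for illustration only; names and root path are assumptions.
final class TokenZnodePaths {
    static final String ROOT = "/RMDelegationTokensRoot"; // placeholder root path
    static final String PREFIX = "RMDelegationToken_";

    /** Builds the znode path for a token sequence number under a split index of 0 to 4. */
    static String pathFor(int sequenceNumber, int splitIndex) {
        if (splitIndex < 0 || splitIndex > 4) {
            throw new IllegalArgumentException("split index must be 0..4: " + splitIndex);
        }
        if (splitIndex == 0) {
            // Flat layout: the token znode sits directly under the root, name unpadded.
            return ROOT + "/" + PREFIX + sequenceNumber;
        }
        // Pad to at least 4 digits so there are always enough characters to split off.
        String name = PREFIX + String.format(Locale.ROOT, "%04d", sequenceNumber);
        int cut = name.length() - splitIndex;
        // <root>/<splitIndex>/<name barring last N characters>/<last N characters>
        return ROOT + "/" + splitIndex + "/" + name.substring(0, cut) + "/" + name.substring(cut);
    }
}
```

With these assumptions, sequence number 5 and a split index of 2 map to {{/RMDelegationTokensRoot/2/RMDelegationToken_00/05}}.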

Some more useful info about the patch:
- The default behavior is to use a flat layout, like before.  
{{yarn.resourcemanager.zk-delegation-token-node.split-index}} can be set to 
{{0}} to keep the flat layout, or to {{1}}, {{2}}, {{3}}, or {{4}} to split on 
the last 1, 2, 3, or 4 digits of the token sequence number.
- Token sequence numbers start at {{0}} and have a variable width, unlike 
Application IDs, which have a width of 4, so when naming their znodes, the code 
pads them to at least 4 digits.  For example, {{RMDelegationToken_5}} becomes 
{{RMDelegationToken_0005}}.  This ensures that the index splitting always has 
enough digits to work with.  The exception is the flat layout, where the names 
are left unpadded so they stay the same as before.
- When looking for a delegation token znode, the code first tries the current 
value of {{yarn.resourcemanager.zk-delegation-token-node.split-index}}, but it 
will fall back to looking at the other possible znode paths in case the token 
was created when {{yarn.resourcemanager.zk-delegation-token-node.split-index}} 
had been set to a different value.  This ensures we don't lose any tokens when 
{{yarn.resourcemanager.zk-delegation-token-node.split-index}} changes.
- I haven't had a chance to try it out in an actual cluster yet, but there are 
unit tests that show it working correctly.  In the meantime, we can still start 
reviews.
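
The fallback lookup described in the list above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patch itself: the class and method names are hypothetical, paths are relative to the delegation token root, and the {{exists}} predicate stands in for whatever ZooKeeper existence check the store actually performs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch of the lookup order; not taken from the patch.
final class TokenZnodeLookup {
    static final int MAX_SPLIT_INDEX = 4;
    static final String PREFIX = "RMDelegationToken_";

    /** Split indices in the order they are tried: the configured one first, then the rest. */
    static List<Integer> lookupOrder(int configuredSplitIndex) {
        if (configuredSplitIndex < 0 || configuredSplitIndex > MAX_SPLIT_INDEX) {
            throw new IllegalArgumentException("split index must be 0..4");
        }
        List<Integer> order = new ArrayList<>();
        order.add(configuredSplitIndex);
        for (int i = 0; i <= MAX_SPLIT_INDEX; i++) {
            if (i != configuredSplitIndex) {
                order.add(i);
            }
        }
        return order;
    }

    /** Path of the token znode relative to the delegation token root. */
    static String relativePath(int sequenceNumber, int splitIndex) {
        if (splitIndex == 0) {
            return PREFIX + sequenceNumber; // flat layout, unpadded name
        }
        String name = PREFIX + String.format(Locale.ROOT, "%04d", sequenceNumber);
        int cut = name.length() - splitIndex;
        return splitIndex + "/" + name.substring(0, cut) + "/" + name.substring(cut);
    }

    /** Returns the first candidate path that exists, or empty if the token is not stored. */
    static Optional<String> find(int sequenceNumber, int configuredSplitIndex,
                                 Predicate<String> exists) {
        for (int splitIndex : lookupOrder(configuredSplitIndex)) {
            String candidate = relativePath(sequenceNumber, splitIndex);
            if (exists.test(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }
}
```

In this sketch, a token written under the flat layout is still found after the split index is changed to {{3}}, at the cost of up to four extra existence checks.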

> Add a hierarchy into the ZKRMStateStore for delegation token znodes to 
> prevent jute buffer overflow
> ---------------------------------------------------------------------------------------------------
>
>                 Key: YARN-7262
>                 URL: https://issues.apache.org/jira/browse/YARN-7262
>             Project: Hadoop YARN
>          Issue Type: Improvement
>    Affects Versions: 2.6.0
>            Reporter: Robert Kanter
>            Assignee: Robert Kanter
>         Attachments: YARN-7262.001.patch
>
>
> We've seen users who are running into a problem where the RM is storing so 
> many delegation tokens in the {{ZKRMStateStore}} that the _listing_ of those 
> znodes is larger than the jute buffer. This is fine during normal operation, but 
> becomes a problem on a failover because the RM will try to read in all of 
> the token znodes (i.e. call {{getChildren}} on the parent znode).  This is 
> particularly bad because everything appears to be okay, but then if a 
> failover occurs you end up with no active RMs.
> There was a similar problem with the YARN application data that was fixed in 
> YARN-2962 by adding a (configurable) hierarchy of znodes so the RM could pull 
> subchildren without overflowing the jute buffer (though it's off by default).
> We should add a hierarchy similar to that of YARN-2962, but for the 
> delegation token znodes.
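
For a sense of scale, the following back-of-the-envelope sketch estimates how many child names fit in one {{getChildren}} listing. It assumes the ZooKeeper default {{jute.maxbuffer}} of {{0xfffff}} bytes and a simplified encoding of a 4-byte count plus a 4-byte length prefix per name; it ignores reply headers, so the real limit is slightly lower. The class and method names are hypothetical.

```java
// Hypothetical estimate only; ignores reply headers and other framing overhead.
final class JuteListingEstimate {
    // ZooKeeper's default jute.maxbuffer, just under 1 MB.
    static final long DEFAULT_JUTE_MAX_BUFFER = 0xfffff;

    /** Approximate bytes for a listing: 4-byte count, then 4-byte length + name per child. */
    static long listingBytes(long children, int nameLength) {
        return 4 + children * (4L + nameLength);
    }

    /** Largest number of children whose listing still fits in the given buffer. */
    static long maxChildren(long bufferBytes, int nameLength) {
        return (bufferBytes - 4) / (4L + nameLength);
    }
}
```

Under these assumptions, with 23-character names such as {{RMDelegationToken_12345}}, a flat parent znode overflows the default buffer at just under 39,000 tokens.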


