[ 
https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15085977#comment-15085977
 ] 

Varun Saxena commented on YARN-2962:
------------------------------------

Rebased and updated the patch.
Additionally, the patch ensures that a change in the split index config does not 
lead to reformatting of the state store.
The patch primarily adopts the suggestion given by [~vinodkv] above.
The storage scheme would look like the following.

{noformat}
  |--- RM_APP_ROOT
  |     |----- HIERARCHIES
  |     |        |----- 1
  |     |        |      |----- (#ApplicationId barring last character)
  |     |        |      |       |----- (#Last character of ApplicationId)
  |     |        |      |       |       |----- (#ApplicationAttemptIds)
  |     |        |      ....
  |     |        |
  |     |        |----- 2
  |     |        |      |----- (#ApplicationId barring last 2 characters)
  |     |        |      |       |----- (#Last 2 characters of ApplicationId)
  |     |        |      |       |       |----- (#ApplicationAttemptIds)
  |     |        |      ....
  |     |        |
  |     |        |----- 3
  |     |        |      |----- (#ApplicationId barring last 3 characters)
  |     |        |      |       |----- (#Last 3 characters of ApplicationId)
  |     |        |      |       |       |----- (#ApplicationAttemptIds)
  |     |        |      ....
  |     |        |
  |     |        |----- 4
  |     |        |      |----- (#ApplicationId barring last 4 characters)
  |     |        |      |       |----- (#Last 4 characters of ApplicationId)
  |     |        |      |       |       |----- (#ApplicationAttemptIds)
  |     |        |      ....
  |     |        |
  |     |----- (#ApplicationId1)
  |     |        |----- (#ApplicationAttemptIds)
  |     |
  |     |----- (#ApplicationId2)
  |     |       |----- (#ApplicationAttemptIds)
  |     ....
  |
{noformat}
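To make the scheme above concrete, here is a small sketch (not the actual patch code) of how the znode path for an app can be derived from the app id and the configured split index. The class and method names are hypothetical, chosen just for illustration.

```java
// Sketch only: computes the znode path for an application id under the
// storage scheme above. Names (ZnodePathSketch, appPath) are illustrative,
// not the names used in the patch.
public class ZnodePathSketch {
    static final String RM_APP_ROOT = "RMAppRoot";

    // Returns the full znode path for appId given the configured split index.
    static String appPath(String appId, int splitIndex) {
        if (splitIndex == 0) {
            // No split: app stored directly under RMAppRoot.
            return RM_APP_ROOT + "/" + appId;
        }
        int splitPos = appId.length() - splitIndex;
        String parent = appId.substring(0, splitPos); // all but last N chars
        String leaf = appId.substring(splitPos);      // last N chars
        return RM_APP_ROOT + "/HIERARCHIES/" + splitIndex + "/"
            + parent + "/" + leaf;
    }

    public static void main(String[] args) {
        // Prints RMAppRoot/application_1352994193343_0001
        System.out.println(appPath("application_1352994193343_0001", 0));
        // Prints RMAppRoot/HIERARCHIES/2/application_1352994193343_00/01
        System.out.println(appPath("application_1352994193343_0001", 2));
    }
}
```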

The split index will be calculated from the end of the application id.
Apps will be stored outside the HIERARCHIES folder (i.e. directly under 
RMAppRoot) if the split index config value is 0 (i.e. no split), which is the 
default. This has been done so that users who do not want to split app nodes 
see no impact during an upgrade.

If the app node is not found in the folder corresponding to the configured 
split index, we will look into the other paths.
At startup, we will load only those app hierarchies which have apps under them, 
plus the hierarchy for the configured split index. This precludes the need to 
look into each and every path (one per split index) when an app znode is not 
found at the path implied by the configured split index.
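The fallback lookup just described could look roughly like this sketch. The in-memory `existingZnodes` set stands in for ZooKeeper `exists()` calls, and all names here are hypothetical, not the patch's.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.TreeSet;

// Sketch only: fall back to other hierarchies when the app znode is not at
// the path implied by the configured split index.
public class AppLookupSketch {
    static final String ROOT = "RMAppRoot";

    // Hierarchies loaded at startup: only those that actually contain apps,
    // plus the one for the configured split index.
    final Set<Integer> candidateSplitIndices = new TreeSet<>();
    // Stand-in for the ZK tree; a real store would call zk.exists(path).
    final Set<String> existingZnodes = new HashSet<>();

    static String path(String appId, int splitIndex) {
        if (splitIndex == 0) {
            return ROOT + "/" + appId;
        }
        int pos = appId.length() - splitIndex;
        return ROOT + "/HIERARCHIES/" + splitIndex + "/"
            + appId.substring(0, pos) + "/" + appId.substring(pos);
    }

    boolean exists(String znodePath) {
        return existingZnodes.contains(znodePath);
    }

    // Try the configured split index first, then only the candidate
    // hierarchies that were seen to contain apps at startup.
    String findAppPath(String appId, int configuredSplitIndex) {
        String primary = path(appId, configuredSplitIndex);
        if (exists(primary)) {
            return primary;
        }
        for (int idx : candidateSplitIndices) {
            if (idx == configuredSplitIndex) {
                continue; // already checked
            }
            String alt = path(appId, idx);
            if (exists(alt)) {
                return alt;
            }
        }
        return null; // app not stored anywhere
    }
}
```

Because only non-empty hierarchies are registered at startup, the loop visits at most a handful of candidate paths rather than every possible split index.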

_Example :_ With no split, the appid znode will be stored as 
{{RMAppRoot/application_1352994193343_0001}}. If the value of this config is 1, 
the appid znode will be broken into two parts, application_1352994193343_000 and 
1, with the former being the parent node.
It will be stored at path 
{{RMAppRoot/HIERARCHIES/1/application_1352994193343_000/1}}, i.e. up to 10 apps 
can be stored under the parent.
If the config is 2, it will be stored at path 
{{RMAppRoot/HIERARCHIES/2/application_1352994193343_00/01}}, i.e. up to 100 apps 
can be stored under the parent.
Likewise, up to 1000 apps can be stored under a parent if the config is 3, and 
10000 apps if it is 4.

When an app is removed, we also remove its parent path if no other apps exist 
under that parent.
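A minimal sketch of that cleanup, assuming the synchronized single-RM access discussed below. The map stands in for ZooKeeper `getChildren`/`delete` calls, and the names are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Sketch only: after deleting an app znode, delete its parent znode too if
// the parent is now empty. A real store would issue ZooKeeper delete ops.
public class ParentCleanupSketch {
    // parent znode path -> leaf child names under it
    final Map<String, Set<String>> children = new HashMap<>();

    void removeApp(String parentPath, String leaf) {
        Set<String> kids = children.get(parentPath);
        if (kids == null) {
            return; // nothing stored under this parent
        }
        kids.remove(leaf); // delete the app znode
        if (kids.isEmpty()) {
            children.remove(parentPath); // delete the now-empty parent znode
        }
    }
}
```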

As the ZKRMStateStore methods are synchronized, I am assuming there will be no 
race when deleting the parent path above, i.e. a race between deleting the app 
node's parent and a new app being stored under the same parent. I hope that is 
a fair assumption, given that only one RM will be active at a time, and only 
one RM should be up in non-HA mode.
Do we need to take care of anything else here?
 

> ZKRMStateStore: Limit the number of znodes under a znode
> --------------------------------------------------------
>
>                 Key: YARN-2962
>                 URL: https://issues.apache.org/jira/browse/YARN-2962
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Karthik Kambatla
>            Assignee: Varun Saxena
>            Priority: Critical
>         Attachments: YARN-2962.01.patch, YARN-2962.04.patch, 
> YARN-2962.2.patch, YARN-2962.3.patch
>
>
> We ran into this issue where we were hitting the default ZK server message 
> size configs, primarily because the message had too many znodes, even though 
> individually they were all small.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
