[ https://issues.apache.org/jira/browse/YARN-2962?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=14526289#comment-14526289 ]

Arun Suresh commented on YARN-2962:
-----------------------------------

bq. we can simplify it without special hierarchies by having RM recursively 
read nodes all the time. As we already specifically look for "application_" 
prefix to read app-data, having application_1234_0456 right next to a 00/ 
directory is simply going to work without much complexity
Hmmm.. Currently, while reading, the code expects only leaf nodes to have data. 
We could modify it to descend into child nodes while loading RMState, but 
updates to an app's state would require some thought. Consider updating the 
state of app Id _1000000: the update code would first have to check both the 
/.._1000000 and /.._10000/00 znodes. Also, retrieving state during load_all and 
update_single might be hairy, since there can be ambiguous paths: a node path 
might not be unique across the 2 schemes. For e.g., /.._10000 will exist in 
both the new and old schemes. In the old scheme it can contain data, but in the 
new scheme it shouldn't (it is an intermediate node for /.._10000/[00-99]).
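
To make the ambiguity concrete, here is a rough sketch (illustrative only; the 
root path constant and the "split off the last two digits" rule are my 
assumptions for the example, not the patch itself) of how the same znode name 
ends up as a data-bearing leaf in the old scheme and an intermediate node in 
the new one:

{code:java}
// Illustrative sketch only -- RM_APP_ROOT and the split-last-two-digits rule
// are assumptions for this example, not the actual patch.
public class ZnodePathSketch {

  static final String RM_APP_ROOT = "/rmstore/ZKRMStateRoot/RMAppRoot";

  /** Old (flat) scheme: one leaf znode per application, holding its data. */
  static String flatPath(String appId) {
    return RM_APP_ROOT + "/" + appId;
  }

  /** New (hierarchical) scheme: last two digits split off into a child znode. */
  static String hierarchicalPath(String appId) {
    int split = appId.length() - 2;
    return RM_APP_ROOT + "/" + appId.substring(0, split)
        + "/" + appId.substring(split);
  }

  public static void main(String[] args) {
    // App with sequence number 1000000 under the new scheme:
    System.out.println(hierarchicalPath("application_1234_1000000"));
    //   -> .../RMAppRoot/application_1234_10000/00   (parent holds no data)
    // App with sequence number 10000 under the old scheme:
    System.out.println(flatPath("application_1234_10000"));
    //   -> .../RMAppRoot/application_1234_10000      (leaf, holds data)
    // Same znode name, different meaning -- hence the ambiguity on load/update.
  }
}
{code}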

Although option 2 can be done, I'd prefer your first suggestion (storing under 
RM_APP_ROOT/hierarchies). We can have the RM read the old style, but new apps 
and updates to old apps will go under the new root. We can even delete the old 
scheme root once it has no children left.
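
Something along these lines (the HIERARCHIES znode name and the exact split 
rule below are just placeholders, not a final design):

{code:java}
// Rough sketch of the read-old/write-new idea -- HIERARCHIES and the split
// rule are placeholders for illustration only.
import java.util.Arrays;
import java.util.List;

public class AppRootMigrationSketch {

  static final String RM_APP_ROOT = "/rmstore/ZKRMStateRoot/RMAppRoot";
  static final String HIERARCHIES_ROOT = RM_APP_ROOT + "/HIERARCHIES";

  /** All writes (new apps, and updates to old apps) go under the new root. */
  static String writePath(String appId) {
    int split = appId.length() - 2;
    return HIERARCHIES_ROOT + "/" + appId.substring(0, split)
        + "/" + appId.substring(split);
  }

  /** Loads look at the old flat path first, then fall back to the new root. */
  static List<String> readPaths(String appId) {
    return Arrays.asList(RM_APP_ROOT + "/" + appId, writePath(appId));
  }

  public static void main(String[] args) {
    System.out.println(writePath("application_1234_0456"));
    // -> .../RMAppRoot/HIERARCHIES/application_1234_04/56
    System.out.println(readPaths("application_1234_0456"));
    // Once the old flat root has no application_* children left, it can be
    // deleted and only the HIERARCHIES root remains.
  }
}
{code}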




> ZKRMStateStore: Limit the number of znodes under a znode
> --------------------------------------------------------
>
>                 Key: YARN-2962
>                 URL: https://issues.apache.org/jira/browse/YARN-2962
>             Project: Hadoop YARN
>          Issue Type: Improvement
>          Components: resourcemanager
>    Affects Versions: 2.6.0
>            Reporter: Karthik Kambatla
>            Assignee: Varun Saxena
>            Priority: Critical
>         Attachments: YARN-2962.01.patch, YARN-2962.2.patch, YARN-2962.3.patch
>
>
> We ran into this issue where we were hitting the default ZK server message 
> size configs, primarily because the message had too many znodes, even though 
> individually they were all small.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
