Document reasonable limits on the size and shape of data for a zookeeper ensemble.
----------------------------------------------------------------------------------
Key: ZOOKEEPER-580
URL: https://issues.apache.org/jira/browse/ZOOKEEPER-580
Project: Zookeeper
Issue Type: Improvement
Components: documentation
Reporter: bryan thompson

I would like to have documentation that clarifies the reasonable limits on the size and shape of data in a ZooKeeper ensemble. Since all znodes and their data are replicated on each peer in an ensemble, there will be a machine limit on the amount of data in a ZooKeeper instance, but I have not seen any guidance on how to estimate that limit.

Presumably the machine limits are primarily determined by the amount of heap available to the JVM before swapping sets in. However, there might well be other, less obvious limits on the number of children per node and the depth of the node hierarchy (in addition to the already documented limit on the amount of data in a node). Hierarchy depth may also interact with performance, which I have not seen detailed anywhere.

Guidance regarding pragmatic and machine limits would be helpful in choosing designs using ZooKeeper which can scale. For example, if metadata about each shard of a partitioned database architecture is mapped onto a distinct znode in ZooKeeper, then there could be a very large number of znodes for a large database deployment. While this would make it easy to reassign shards to services dynamically, the design might impose an unforeseen limit on the number of shards in the database. A similar concern would apply to an attempt to maintain metadata about each file in a distributed file system.

Issue [ZOOKEEPER-272] described some problems when nodes have a large number of children. However, it did not elaborate on whether the change to an Iterator model would break the atomic semantics of the List<String> of children, or whether the Iterator would be backed by a snapshot of the children as they existed at the time the iterator was requested, which would put a memory burden on the ensemble.

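To make the shard-per-znode example concrete, here is a minimal sketch of one possible layout that bounds the number of children under any single znode by hashing shard ids into a fixed set of bucket znodes. The function names, the "/shards" root, and the bucket count of 256 are all hypothetical and chosen only for illustration; the sketch computes paths only and does not talk to a ZooKeeper server.

```python
import zlib


def shard_znode_path(shard_id: str, root: str = "/shards", buckets: int = 256) -> str:
    """Map a shard id to a two-level znode path: <root>/<bucket>/<shard_id>.

    The root, bucket count, and naming scheme are illustrative assumptions,
    not anything prescribed by ZooKeeper.
    """
    if not shard_id or "/" in shard_id:
        raise ValueError("shard_id must be non-empty and must not contain '/'")
    if buckets < 1:
        raise ValueError("buckets must be at least 1")
    # crc32 is stable across processes, unlike Python's built-in hash() for str.
    bucket = zlib.crc32(shard_id.encode("utf-8")) % buckets
    # Zero-pad the bucket name so every bucket znode has the same name length.
    width = len(f"{buckets - 1:x}")
    return f"{root}/{bucket:0{width}x}/{shard_id}"


def average_children_per_bucket(shard_count: int, buckets: int = 256) -> int:
    """Average number of shard znodes per bucket, rounded up."""
    return -(-shard_count // buckets)
```

With one million shards and 256 buckets, each bucket znode would hold roughly 3,907 children on average rather than one parent holding all of them. The trade-off is the one this issue asks about: listing every shard now takes one children request per bucket, so the combined result is presumably no longer a single atomic view of the children.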
This raises the related question of when designs that work around scaling limits in ZooKeeper might break desirable semantics, primarily the ability to have a consistent view of the distributed state. Put another way, are there anti-patterns for ZooKeeper relating to scalability? Too many children? Too much depth? Should one avoid decomposing large numbers of children into hierarchies? Etc.

--
This message is automatically generated by JIRA.
-
You can reply to this email to add a comment to the issue online.