Neunhoef added a comment.

> Assuming we're OK with just planning for large server deployments: Does the
> memory requirement scale linearly with the size of the data? How does that
> play with sharding and replication? How large are the largest ArangoDB
> clusters?

The memory requirements for the data files scale linearly with the actual data size. The memory requirements for the shapes will probably scale sub-linearly with the data size; this is what I observed today, and it sounds reasonable, since later data sets can reuse existing shapes from earlier data sets. The memory requirements for each index scale essentially linearly with the data, but not continuously: for example, the memory requirement of a hash table jumps when it has to rehash. Skip lists do not show this behaviour.

Replication simply replays the actions of one server on another one, so the memory usage will be identical on each machine. Sharding distributes the data to different machines, each of which indexes only its own part. ArangoDB does not have actual global indexes for a sharded collection. Queries are run against the local index on each shard and the results are merged.

We do not know what the largest actual deployments of ArangoDB are. However, scalability will crucially depend on the queries that actually hit the database. For example, simply finding all documents with a specified value or range in one indexed field is trivially shardable and will be efficient even with huge clusters. Certain unfortunate joins can be more problematic. For example, graph traversals in a graph that is sharded in an unfavourable way can be a disaster.

TASK DETAIL
  https://phabricator.wikimedia.org/T88549

To: Smalyshev, Neunhoef
Cc: Neunhoef, Fceller, JanZerebecki, Aklapper, Manybubbles, jkroll, Smalyshev, Wikidata-bugs, aude, GWicke, daniel
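
The per-shard querying described in the comment (each shard consults only its own local index, and the partial results are merged) can be illustrated with a small sketch. This is not ArangoDB code: the `Shard` class, the `shard_for` placement function, and the sorted-list index are all hypothetical stand-ins chosen for illustration, and a real deployment would differ in many ways.

```python
import bisect
import heapq
import zlib


class Shard:
    """Hypothetical shard: holds only its own part of the data,
    indexed locally as a sorted list of (value, doc_id) pairs."""

    def __init__(self):
        self.index = []

    def insert(self, doc_id, value):
        # Keep the local index sorted by (value, doc_id).
        bisect.insort(self.index, (value, doc_id))

    def range_query(self, low, high):
        # (low,) sorts before every (low, doc_id), so this finds the
        # first entry whose value is >= low.
        start = bisect.bisect_left(self.index, (low,))
        result = []
        for value, doc_id in self.index[start:]:
            if value > high:
                break
            result.append((value, doc_id))
        return result


def shard_for(doc_id, n_shards):
    # Assumed placement rule: a deterministic hash of the document id.
    return zlib.crc32(doc_id.encode("utf-8")) % n_shards


def build_cluster(docs, n_shards):
    """Distribute (doc_id, value) pairs over n_shards local indexes."""
    shards = [Shard() for _ in range(n_shards)]
    for doc_id, value in docs:
        shards[shard_for(doc_id, n_shards)].insert(doc_id, value)
    return shards


def scatter_gather_range(shards, low, high):
    # No global index: ask every shard, then merge the sorted partials.
    partials = [shard.range_query(low, high) for shard in shards]
    return list(heapq.merge(*partials))
```

Each shard answers the range query independently of the others, which is why this kind of lookup is easy to shard: adding machines adds more small local indexes rather than enlarging one global structure.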
