Neunhoef added a comment.

> Assuming we're OK with just planning for large server deployments: Does the 
> memory requirement scale linearly with the size of the data?  How does that 
> play with sharding and replication?  How large are the largest ArangoDB 
> clusters?


The memory requirements for the data files will scale linearly with the actual 
data size.
The memory requirements for the shapes will probably scale sub-linearly with 
the actual data size. This is what I observed today, and it sounds reasonable, 
since later data sets will be able to reuse existing shapes from earlier data 
sets.
The memory requirements for each index scale essentially linearly with the 
data, but not continuously so: for example, the memory requirement of a hash 
table jumps when it has to rehash. Skip lists do not show this behaviour.
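
To illustrate the stepwise growth of a hash index, here is a minimal sketch in 
Python. It is not ArangoDB's implementation: the initial capacity, the 
doubling policy, the load factor and the function name table_capacity are all 
assumptions chosen for illustration.

```python
def table_capacity(n_items, initial_capacity=8, max_load=0.75):
    """Slots allocated by a toy hash table after inserting n_items.

    Assumed policy (hypothetical, for illustration only): the table
    doubles its capacity whenever the load factor would exceed max_load.
    """
    capacity = initial_capacity
    while n_items > capacity * max_load:
        capacity *= 2  # a rehash allocates a table twice as large
    return capacity


# The number of entries grows one at a time, but the allocated slots
# only change at rehash points, so memory use moves in steps.
for n in (6, 7, 12, 13):
    print(n, table_capacity(n))
```

With these assumed parameters the allocation stays flat between rehashes and 
doubles at each one, while a structure that allocates one node per entry, such 
as a skip list, grows by a small amount with every insert.
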
Replication simply replays the actions of one server on another one, so the 
memory usage will be identical on each machine.
Sharding distributes the data to different machines, each of which indexes 
only its own part. ArangoDB does not have actual global indexes for a sharded 
collection. Queries are run against the local index on each shard and the 
results are merged.
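
The query pattern described above can be sketched as follows. This is a toy 
model in Python, not ArangoDB code: the shards are plain lists, and the names 
build_local_index and scatter_gather are hypothetical helpers introduced for 
this example.

```python
from collections import defaultdict


def build_local_index(documents, field):
    """Index only the documents held by one shard (no global index)."""
    index = defaultdict(list)
    for doc in documents:
        index[doc[field]].append(doc)
    return index


def scatter_gather(local_indexes, value):
    """Run the same lookup against every shard's index and merge the hits."""
    results = []
    for index in local_indexes:
        results.extend(index.get(value, []))
    return results


# Two shards, each holding and indexing only its own documents.
shards = [
    [{"_key": "a", "color": "red"}, {"_key": "b", "color": "blue"}],
    [{"_key": "c", "color": "red"}],
]
local_indexes = [build_local_index(docs, "color") for docs in shards]
matches = scatter_gather(local_indexes, "red")
print([doc["_key"] for doc in matches])
```

Each shard answers from its own index independently, which is why a lookup on 
one indexed field spreads well across many machines.
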
We do not know what the largest actual deployments of ArangoDB are. However, 
scalability will depend crucially on the queries that actually hit the 
database. For example, simply finding all documents with a specified value or 
range in one indexed field is trivially shardable and will be efficient even 
on huge clusters. Certain unfortunate joins can be more problematic. For 
example, graph traversals in a graph that is sharded in an unfavourable way can 
be a disaster.


TASK DETAIL
  https://phabricator.wikimedia.org/T88549
