jcrespo added a comment.
Storage is not a problem. I wonder what the impact is on IO activity (write QPS). Could we move usage tracking to a different set of servers? These tables are probably very dynamic, but also probably not 100% in sync with content edits (they are handled by asynchronous jobs). The issue is not the storage they take on a single database; the main issue is that we have, for example, 20 enwiki slaves, which means every single one of them has to continuously overwrite these tables. Separating them allows dedicated resources for main metadata vs. statistics, each one being individually faster.
I wonder if they could live on a separate set of servers with fewer slaves, to avoid unnecessary write IO amplification. Separating things vertically can help with replication health, and whenever it is necessary (as on the analytics slaves) we can use multi-source replication to consolidate all of them on a single server.
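To make the amplification argument concrete, here is a rough back-of-the-envelope sketch. Only the figure of 20 replicas comes from the example above; the write rates and the size of the dedicated usage-tracking cluster are made-up numbers chosen for illustration, not measurements.

```python
def replica_write_load(write_qps: float, replicas: int) -> float:
    """Total writes per second applied across all replicas of one cluster.

    Every replica replays every write of its master, so the aggregate
    load is simply write_qps multiplied by the number of replicas.
    """
    return write_qps * replicas


# Hypothetical figures (assumptions, not measurements).
CONTENT_QPS = 100.0   # writes/s to the main metadata tables
USAGE_QPS = 300.0     # writes/s to the usage tracking tables
MAIN_REPLICAS = 20    # "20 enwiki slaves", as in the example above
USAGE_REPLICAS = 4    # assumed size of a smaller, dedicated cluster

# Today: all tables live together, so every replica replays all writes.
combined = replica_write_load(CONTENT_QPS + USAGE_QPS, MAIN_REPLICAS)

# Proposed: usage tracking replicates only to its own smaller cluster.
split = (replica_write_load(CONTENT_QPS, MAIN_REPLICAS)
         + replica_write_load(USAGE_QPS, USAGE_REPLICAS))

print(f"combined: {combined:.0f} replica writes/s")
print(f"split:    {split:.0f} replica writes/s")
print(f"per main replica: {CONTENT_QPS + USAGE_QPS:.0f} -> {CONTENT_QPS:.0f} writes/s")
```

With these assumed numbers, the aggregate replica write load drops from 8000 to 3200 writes/s, and each main replica applies 100 writes/s instead of 400. The real gain depends entirely on the actual write rates, which is why I am asking about write QPS.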
Cc: Halfak, jcrespo, TomT0m, Hall1467, hoo, zhuyifei1999, Eloquence, Lydia_Pintscher, Sannita, Ainali, Liuxinyu970226, MZMcBride, Ricordisamoa, Micru, jayvdb, Daniel_Mietchen, Tobi_WMDE_SW, Legoktm, Abraham, Wikidata-bugs, liangent, jeremyb, aude, Candalua, Bianjiang, Aklapper, DixonD, daniel, D3r1ck01, Izno, Mbch331
_______________________________________________
Wikidata-bugs mailing list
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
