Anomie added a comment.

Increasing the batch size to 1000 increased the time-per-transaction to about 2-3 seconds and decreased the total time to 12447 seconds (a savings of only about 23 minutes). Time for the subsequent runs was down to about 32 seconds. MRSS (maximum resident set size) stayed the same (BTW, an initial run with --reuse-content had an MRSS of 687472 K).

If the time scales linearly, the 840894447 rows on enwiki would save 52 hours (off of 21 days) with the larger batch size. Once I run things against the commonswiki data set, that should give a better indication.
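As a rough consistency check on the linear extrapolation above: the test set's row count isn't stated in the comment, but it can be inferred from the 23-minute savings versus the projected 52-hour savings on enwiki. This is just back-of-the-envelope arithmetic from the figures quoted, not a measurement.

```python
# Linear extrapolation of the batch-size savings to enwiki.
# All figures come from the comment; the test set's row count is
# inferred, not measured.
test_savings_s = 23 * 60        # ~23 minutes saved on the test run
enwiki_rows = 840_894_447
enwiki_savings_s = 52 * 3600    # projected 52-hour savings on enwiki

# Implied test-set size under linear scaling:
implied_test_rows = enwiki_rows * test_savings_s / enwiki_savings_s
print(f"{implied_test_rows / 1e6:.1f} million rows")  # ~6.2 million
```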

One more thing to try: batching the slot row inserts within each transaction instead of doing them individually. It'd be nice to batch the content row inserts too, but doing that sanely seems rather complicated, since we need the generated ID back from each inserted row.
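The asymmetry described above can be sketched as follows. This is a minimal illustration with a hypothetical two-table schema (not the actual MediaWiki `slots`/`content` tables), using SQLite for self-containment: content rows must be inserted one at a time because each auto-generated ID has to be read back and referenced by a slot row, while the slot rows themselves can then go in as one multi-row batch per transaction.

```python
import sqlite3

# Hypothetical schema standing in for the content/slots tables.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE content (content_id INTEGER PRIMARY KEY,
                          content_address TEXT);
    CREATE TABLE slots   (slot_revision_id INTEGER,
                          slot_content_id INTEGER);
""")

pending = [(rev_id, f"addr:{rev_id}") for rev_id in range(1, 6)]

slot_rows = []
for rev_id, address in pending:
    # Content inserts stay individual: we need the generated
    # content_id back to build the corresponding slot row.
    cur = conn.execute(
        "INSERT INTO content (content_address) VALUES (?)", (address,))
    slot_rows.append((rev_id, cur.lastrowid))

# Slot inserts don't need IDs back, so the whole batch goes in
# as one statement inside the transaction.
conn.executemany("INSERT INTO slots VALUES (?, ?)", slot_rows)
conn.commit()
print(conn.execute("SELECT COUNT(*) FROM slots").fetchone()[0])  # 5
```

On MySQL/MariaDB the same shape would use a multi-row `INSERT ... VALUES (...), (...)` for the slot batch; batching the content inserts would require something like `LAST_INSERT_ID()` arithmetic or per-row round trips, which is the complication noted above.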


