GoranSMilovanovic closed this task as "Resolved".
GoranSMilovanovic added a comment.
Following the developments on T210147:
- The WDCM main update engine will run on a weekly basis,
- synced to start 10 hours after the onset of the Sqoop procedure (i.e. the transfer from MariaDB to HDFS),
- on the 1st, 7th, 14th, 20th, and 27th of each month, so we will have five updates per month.
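
As a rough illustration of the schedule above, the update start times for a given month could be computed as in the sketch below. The function name `wdcm_update_times` and the `sqoop_start_hour` parameter are hypothetical, and the sketch assumes that the listed days are the days on which the Sqoop transfer begins and that it begins on the hour; neither assumption is confirmed here.

```python
from datetime import datetime, timedelta

# Days of the month listed in the schedule above.
UPDATE_DAYS = (1, 7, 14, 20, 27)
# The update is synced to start 10 hours after the Sqoop onset.
OFFSET_AFTER_SQOOP = timedelta(hours=10)


def wdcm_update_times(year, month, sqoop_start_hour=0):
    """Return the planned WDCM update start times for one month.

    sqoop_start_hour is a hypothetical parameter: the hour (0-23) at which
    the Sqoop transfer is assumed to begin on each scheduled day.
    """
    return [
        datetime(year, month, day, sqoop_start_hour) + OFFSET_AFTER_SQOOP
        for day in UPDATE_DAYS
    ]


if __name__ == "__main__":
    for start in wdcm_update_times(2019, 1):
        print(start.isoformat())
```

If the Sqoop transfer starts late in the day, the computed update time simply rolls over to the following calendar day.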
With the new procedures, and following the scaling to Apache Spark, we could run WDCM on a daily basis with no trouble at all, except that the Sqoop transfer from MariaDB (the client wbc_entity_usage tables) to HDFS takes hours to complete. I will see if there is anything I can do to speed it up, but I think the chances are slim.
To: GoranSMilovanovic
Cc: Tobi_WMDE_SW, Addshore, Lydia_Pintscher, GoranSMilovanovic, Aklapper, Nandana, Lahi, Gq86, QZanden, LawExplorer, _jensen, D3r1ck01, Wikidata-bugs, aude, Mbch331
_______________________________________________ Wikidata-bugs mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
