GoranSMilovanovic closed this task as "Resolved".
GoranSMilovanovic added a comment.

@Lydia_Pintscher

Following the developments on T210147:

  • The WDCM main update engine will run on a weekly basis,
  • synced to start 10 hours after the onset of the Sqoop procedure (i.e. the transfer from MariaDB to HDFS),
  • on the 1st, 7th, 14th, 20th, and 27th of each month, so we will have five updates per month.
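
For illustration, the schedule above can be sketched in Python. This is only a rough sketch, not the actual WDCM scheduling code: the function name `wdcm_update_times` and the `sqoop_start_hour` parameter are hypothetical, and it assumes that the Sqoop procedure starts on the listed days at a known hour (the real start time is not stated here).

```python
from datetime import datetime, timedelta

# Days of the month on which an update is planned (taken from the list above).
UPDATE_DAYS = (1, 7, 14, 20, 27)

# The WDCM update starts 10 hours after the onset of the Sqoop procedure.
SQOOP_OFFSET = timedelta(hours=10)


def wdcm_update_times(year, month, sqoop_start_hour=0):
    """Return the planned WDCM update start times for one month.

    sqoop_start_hour is an assumed, hypothetical hour of day (0-23) at which
    the Sqoop transfer begins on each update day.
    """
    return [
        datetime(year, month, day, sqoop_start_hour) + SQOOP_OFFSET
        for day in UPDATE_DAYS
    ]


if __name__ == "__main__":
    for start in wdcm_update_times(2019, 1):
        print(start.isoformat())
```

Note that with a late assumed Sqoop start (e.g. `sqoop_start_hour=20`), the 10-hour offset pushes the WDCM start into the following calendar day.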

With the new procedures in place and WDCM scaled to Apache Spark, we could run the update on a daily basis with no trouble at all. What prevents that is the Sqoop transfer from MariaDB (the client wbc_entity_usage tables) to HDFS, which takes hours to complete. I will see if there is anything I can do to speed it up, but I think the chances are slim.


TASK DETAIL
https://phabricator.wikimedia.org/T179286


To: GoranSMilovanovic
Cc: Tobi_WMDE_SW, Addshore, Lydia_Pintscher, GoranSMilovanovic, Aklapper, Nandana, Lahi, Gq86, QZanden, LawExplorer, _jensen, D3r1ck01, Wikidata-bugs, aude, Mbch331