dcausse moved this task from In Progress to Needs review on the 
Discovery-Search (Current work) board.
dcausse added a comment.


  - Streaming updater pipeline is not failing because of recoverable errors: 
the only recoverable errors I can think of right now are transient failures of 
the various connection points (HDFS, Kafka, checkpoint timeouts). These seem 
rare and could be handled using Flink restart strategies. We might want to 
make this configurable in the future.
  - Calls to Wikibase are retried (see the current implementation of 
WikibaseRepository, which already has a retry mechanism): Wikibase fetches 
are retried 4 times, after which they produce a FailedOp; they do not fail 
the pipeline.
  - If the error that fails the pipeline is unrecoverable, Flink shouldn't 
retry ad infinitum: this is related to the restart strategies; currently we 
do not tolerate even a single unexpected failure.
  - Data errors (e.g. revisions after a page delete, constant 404 on a new 
revision) should be logged to a file on HDFS in structured form: done.
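
To illustrate the second and fourth points, here is a minimal Python sketch of 
the "retry, then emit a FailedOp instead of failing" pattern. It is not the 
actual WikibaseRepository code: TransientFetchError, FetchedOp, 
fetch_with_retries and to_log_line are hypothetical names, the FailedOp fields 
are invented for the example, and reading "retried 4 times" as 4 attempts in 
total is an assumption.

```python
import json
from dataclasses import dataclass, asdict
from typing import Callable, Union

# Assumption: "retried 4 times" is read here as 4 attempts in total.
MAX_ATTEMPTS = 4


class TransientFetchError(Exception):
    """Hypothetical marker for a fetch failure that is worth retrying."""


@dataclass
class FetchedOp:
    entity_id: str
    revision: int
    payload: str


@dataclass
class FailedOp:
    # Field names are illustrative, not those of the real FailedOp.
    entity_id: str
    revision: int
    attempts: int
    reason: str


def fetch_with_retries(
    fetch: Callable[[str, int], str],
    entity_id: str,
    revision: int,
    max_attempts: int = MAX_ATTEMPTS,
) -> Union[FetchedOp, FailedOp]:
    """Call fetch up to max_attempts times; never raise on transient errors."""
    last_error = ""
    for _ in range(max_attempts):
        try:
            return FetchedOp(entity_id, revision, fetch(entity_id, revision))
        except TransientFetchError as exc:
            last_error = str(exc)
    # Retries exhausted: report the problem as data so the pipeline keeps running.
    return FailedOp(entity_id, revision, max_attempts, last_error)


def to_log_line(op: FailedOp) -> str:
    """Serialize a failure as one JSON line, i.e. a structured log record."""
    return json.dumps(asdict(op), sort_keys=True)
```

The point of returning a FailedOp rather than raising is that a persistent 404 
becomes a record that can be written to a side log, while only genuinely 
unexpected exceptions propagate and stop the job.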

TASK DETAIL
  https://phabricator.wikimedia.org/T248449

WORKBOARD
  https://phabricator.wikimedia.org/project/board/1227/

To: dcausse
Cc: dcausse, Aklapper, Zbyszko, Alter-paule, Beast1978, CBogen, Un1tY, 
Akuckartz, Hook696, darthmon_wmde, Kent7301, joker88john, CucyNoiD, Nandana, 
Namenlos314, Gaboe420, Giuliamocci, Cpaulf30, Lahi, Gq86, Af420, Bsandipan, 
Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, 
LawExplorer, Lewizho99, Maathavan, _jensen, rosalieper, Scott_WUaS, Jonas, 
Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331