[Wikidata-bugs] [Maniphest] [Commented On] T176927: WDQS updater crashed

Yurik Tue, 24 Oct 2017 16:48:07 -0700

Yurik added a comment.

I also disagree :) The real monitoring should not look at whether the process is running at all. It should only look at the last update timestamp to see how far behind WDQS is. If it falls further behind than X, send the alert. That would be a very stable indicator that something is wrong, no matter whether the process hung, crashed, or simply cannot cope with the amount of data.

On the other hand, the updater service itself should be resilient to any kind of problem: if there is an intermittent problem such as DNS being temporarily down (like I had), the service should keep trying and self-recover the moment the network is back up. This is the same logic as in any router or replication service: they always keep trying until they succeed.
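A rough sketch of both halves of this idea, just to make it concrete. Everything here is hypothetical: `lag_exceeded`, `retry_forever`, the delay values, and the choice of `OSError` as the "transient failure" class are assumptions for illustration, not taken from the actual WDQS updater or its monitoring.

```python
import time
from datetime import datetime, timedelta, timezone


def lag_exceeded(last_update, now, max_lag):
    """Return True when the data is further behind than max_lag.

    last_update: timestamp of the newest change already applied
                 (how it is obtained is left out of this sketch).
    max_lag:     the threshold "X" after which an alert should fire.
    """
    return (now - last_update) > max_lag


def retry_forever(operation, base_delay=1.0, max_delay=60.0, sleep=time.sleep):
    """Call operation() until it succeeds, backing off between attempts.

    Only OSError is treated as transient here (it covers ConnectionError
    and socket.gaierror, i.e. DNS failures); anything else propagates.
    """
    delay = base_delay
    while True:
        try:
            return operation()
        except OSError:
            sleep(delay)
            # Double the wait each time, but never exceed max_delay.
            delay = min(delay * 2, max_delay)


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    stale = now - timedelta(hours=2)
    if lag_exceeded(stale, now, timedelta(hours=1)):
        print("ALERT: updater is more than 1 hour behind")
```

The point of the split is that the alert depends only on the observed lag, so it fires the same way for a hang, a crash, or an overloaded updater, while the retry loop keeps short outages from ever turning into a crash.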

TASK DETAIL

https://phabricator.wikimedia.org/T176927

EMAIL PREFERENCES

https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: Yurik
Cc: Smalyshev, Aklapper, Yurik, Lucas_Werkmeister_WMDE, GoranSMilovanovic, QZanden, EBjune, merbst, Avner, debt, Gehel, Jonas, FloNight, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, Mbch331

_______________________________________________
Wikidata-bugs mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
