Hi!

> Well, we only noticed what was up due to this email!
> Take a look at https://phabricator.wikimedia.org/T119915

Yes, we need to look into it. The problem is that the service has two
failure modes:

1. Completely dead, rejecting all queries. Icinga would catch this and
send an alert.

2. Crawling slowly: still partially alive, but performing very badly.
We do not have an adequate alerting system for this one. This failure
mode is rare, but we've seen it happen, both because somebody sent a
torrent of heavy queries and because of some bug scenarios. Icinga does
not catch it because it only checks very basic queries, and those still
complete within the timeout.
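
To make the distinction concrete, here is a rough sketch of a check that
tells the two modes apart by timing a probe query instead of only
checking that it succeeds. This is not what Icinga does today; the
function name, the "slow_after" threshold and the idea of passing in a
heavier probe query are all assumptions made for illustration.

```python
import time
from typing import Callable


def classify_health(
    run_query: Callable[[], None],
    slow_after: float,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Return "dead", "slow" or "ok" for one probe query.

    run_query  -- hypothetical callable that sends a probe query and
                  raises if the service rejects it or cannot be reached
    slow_after -- hypothetical latency threshold in seconds
    clock      -- time source, injectable so the check can be tested
    """
    start = clock()
    try:
        run_query()
    except Exception:
        # Failure mode 1: the service rejects the query outright.
        return "dead"
    elapsed = clock() - start
    # Failure mode 2: the query succeeds, but takes far too long.
    return "slow" if elapsed > slow_after else "ok"
```

For this to catch the second mode, the probe would have to be heavier
than the very basic queries used now, and the threshold would have to
sit well below the query timeout.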

-- 
Stas Malyshev
smalys...@wikimedia.org

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
