hoo added a comment.
@Smalyshev Do you think it would be enough to query `http://prometheus.svc.eqiad.wmnet/ops/api/v1/query?query=blazegraph_lastupdated` and `http://prometheus.svc.codfw.wmnet/ops/api/v1/query?query=blazegraph_lastupdated` (regardless of which DC MW is running in) and simply take all servers into account? Or do we need a whitelist, a blacklist (or both), or some other mechanism to make sure we don't, e.g., count servers that are under maintenance? If we aggregate the lags with the median (or maybe even a high percentile), the results might be robust enough even if a few servers are in maintenance.

TASK DETAIL
https://phabricator.wikimedia.org/T221774
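
To make the median idea in the comment concrete, here is a rough Python sketch, not a proposed implementation. It assumes that `blazegraph_lastupdated` reports a Unix timestamp of each server's last update (so lag is "now minus value"), that servers are identified by the `instance` label, and that the responses follow the standard Prometheus instant-query JSON layout. The helper names (`fetch`, `lags_from_response`, `median_lag`) and the `excluded` set are hypothetical.

```python
import json
import statistics
import time
import urllib.request

# The two endpoints named in the comment.
PROMETHEUS_URLS = [
    "http://prometheus.svc.eqiad.wmnet/ops/api/v1/query?query=blazegraph_lastupdated",
    "http://prometheus.svc.codfw.wmnet/ops/api/v1/query?query=blazegraph_lastupdated",
]


def fetch(url, timeout=5):
    """Fetch one Prometheus instant-query response and parse it as JSON."""
    with urllib.request.urlopen(url, timeout=timeout) as resp:
        return json.load(resp)


def lags_from_response(payload, now, excluded=frozenset()):
    """Return per-server lags (seconds) from one instant-query response.

    Assumption: each sample value is the Unix timestamp of the last update.
    """
    if payload.get("status") != "success":
        return []
    lags = []
    for sample in payload["data"]["result"]:
        # Assumption: the "instance" label identifies the server.
        instance = sample["metric"].get("instance", "")
        if instance in excluded:  # hypothetical blacklist of servers to skip
            continue
        _, value = sample["value"]  # [sample timestamp, value as a string]
        lags.append(max(0.0, now - float(value)))
    return lags


def median_lag(payloads, now=None, excluded=frozenset()):
    """Median lag across all servers in all responses, or None if no data."""
    now = time.time() if now is None else now
    lags = []
    for payload in payloads:
        lags.extend(lags_from_response(payload, now, excluded))
    return statistics.median(lags) if lags else None
```

It could be called as `median_lag([fetch(u) for u in PROMETHEUS_URLS])`. With three servers lagging 10 s, 20 s and 1000 s (the last one in maintenance), the median is 20 s while the mean would be about 343 s, which is the robustness the comment is hoping for; whether that is enough without an explicit exclusion list is exactly the open question.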