Joe added a comment.
Thinking about this again: what we really want to know is the maximum lag of
any server that is receiving user traffic.
So I crafted the following expression in Prometheus:
`max(time() - blazegraph_lastupdated and
rate(blazegraph_queries_done_total{}[5m]) > 10)`
This gives us the maximum lag across servers that are receiving a significant
amount of traffic (more than 10 queries per second, averaged over 5 minutes).
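
To make the filtering explicit, here is a minimal Python sketch of what the expression computes. It is only an illustration, not how Prometheus evaluates it: the function name, the host names and the sample values are all made up, and the per-host dictionaries stand in for the `blazegraph_lastupdated` and `rate(blazegraph_queries_done_total[5m])` series.

```python
from typing import Mapping, Optional


def max_lag_of_serving_hosts(
    now: float,
    last_updated: Mapping[str, float],
    query_rate: Mapping[str, float],
    min_rate: float = 10.0,
) -> Optional[float]:
    """Return the largest lag (seconds) among hosts whose query rate exceeds min_rate.

    Mirrors the `and` in the PromQL expression: a host is kept only if it
    also has a rate sample above the threshold. Returns None when no host
    qualifies, much like an empty query result.
    """
    lags = [
        now - updated_at
        for host, updated_at in last_updated.items()
        # Hosts without a rate sample are dropped, as `and` would drop them.
        if host in query_rate and query_rate[host] > min_rate
    ]
    return max(lags) if lags else None


# Hypothetical sample data: timestamps in seconds, rates in queries/second.
now = 1000.0
last_updated = {"host-a": 990.0, "host-b": 400.0, "host-c": 940.0}
query_rate = {"host-a": 25.0, "host-b": 0.5, "host-c": 12.0}

result = max_lag_of_serving_hosts(now, last_updated, query_rate)
print(result)
```

In this made-up data, host-b has by far the largest lag but serves almost no queries, so it is excluded and the reported maximum comes from host-c.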
I added a graph showing this max lag calculation here:
https://grafana.wikimedia.org/d/000000489/wikidata-query-service?orgId=1&refresh=1m&from=now-24h&to=now&var-cluster_name=wdqs&viewPanel=41
TASK DETAIL
https://phabricator.wikimedia.org/T331405