Gehel added a subscriber: Mathew.onipe. Gehel added a comment.
In T199228#4655321, @Smalyshev wrote: I think update lag is not the biggest issue. Endpoint availability and response times are more important for most users, at least short-term. If there's a lag spike that goes away, most users won't even notice (persistent lag is different, of course). If, however, the user's queries time out, that is different.
If update lag is not a big issue for our users, then we should make it clear. We should increase the threshold on the Icinga alert and make our alerting match the reality of the expectations. That would already be a big step forward from my point of view. I (or @Mathew.onipe) would not have to try to depool servers to help them catch up, or feel bad for doing nothing when an alert is raised.
The problem that I see here is how to quantify what we want. We can probably reasonably promise endpoint availability, as in "can run trivial queries" (even that I would not be sure how to quantify). However, if we get to the "interesting" queries, the variety is so large that I am not sure how to express any guarantees in any certain terms. Maybe p95/p99? But that can be influenced by any random bot...
Do you have any SLOs in mind that we could look at to get an impression of what that should look like?
I don't have anything that would directly apply to the WDQS use case. One of the limits is: "if you provide a crappy / expensive / wrong query, we're not going to give you good results in a good time frame". And WDQS is definitely more sensitive to this by nature than other services that we expose.
My concerns come from an operational standpoint. Do I (or someone else) have to wake up during the night if I receive an alert from this service?
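To make the p95/p99 concern quoted above concrete, here is a minimal sketch of how a single heavy client can move a latency percentile. All numbers are made up for illustration, the 60-second value is only an assumed timeout, and `percentile` is a hypothetical helper using the nearest-rank definition, not anything taken from the WDQS monitoring setup.

```python
import math


def percentile(samples, p):
    """Nearest-rank percentile: the smallest sample such that at least
    p percent of all samples are less than or equal to it."""
    if not samples:
        raise ValueError("need at least one sample")
    ordered = sorted(samples)
    rank = max(1, math.ceil(p * len(ordered) / 100))
    return ordered[rank - 1]


# Hypothetical response times in seconds for ordinary users (100 requests).
regular = [0.2] * 90 + [1.0] * 8 + [5.0] * 2

# Hypothetical bot sending 20 expensive queries that all hit an assumed
# 60-second timeout.
bot = [60.0] * 20

p95_regular = percentile(regular, 95)
p95_with_bot = percentile(regular + bot, 95)

print(f"p95 without bot: {p95_regular}s")
print(f"p95 with bot:    {p95_with_bot}s")
```

In this toy data set the bot accounts for about 17% of requests, which is enough to pull the p95 from 1 second up to the timeout value. An SLO based on raw percentiles would therefore probably need some way to separate or cap traffic from such clients before it could drive paging alerts.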
Cc: Mathew.onipe, Stashbot, Lydia_Pintscher, EBjune, debt, Joe, Smalyshev, Gehel, Aklapper, Nandana, AndyTan, Davinaclare77, Qtn1293, Lahi, Gq86, Lucas_Werkmeister_WMDE, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, merbst, LawExplorer, Zppix, Jonas, Xmlizer, Wong128hk, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi
