hoo added a comment.

In https://phabricator.wikimedia.org/T123867#1961241, @jcrespo wrote:

> I suppose it is possible, there is a spike of 26 seconds of lag on db1045,
> but probably only for a few seconds. But it is not s5 going read-only; it is
> the API going read-only because of "too much lag" on db1045. It is a protection
> measure. It means it is working as intended. We can minimize this happening
> (we mentioned having a second API server, when we get the hardware), avoid
> lag by fixing MediaWiki's queries, but this is not an error that should not
> happen; it is meant to force users to retry and not saturate the servers.

The API only goes read-only if more than half of the servers are lagged for more than 5s. That really should not happen unless there actually are too many writes. I agree that it's OK for that to happen sometimes, but it shouldn't happen often.

To actually know how often this happens, I would like to get https://gerrit.wikimedia.org/r/264595 merged and deployed. Please review.

Maybe we should increase `APIMaxLagThreshold` to 7s or even 10s? To be able to make such choices, we might also want to log the lag times whenever the API goes read-only.

TASK DETAIL
  https://phabricator.wikimedia.org/T123867
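
To make the rule above concrete, here is a minimal Python sketch of a "more than half of the servers above the threshold" check that also logs the lag times when it trips. This is not the actual MediaWiki or Wikibase code: the function names, the logger name, and the server names other than db1045 are hypothetical, and it assumes "lagged for more than 5s" means a server's current lag value exceeds the threshold.

```python
import logging
from typing import Dict, Mapping

# Hypothetical logger name, used only for this illustration.
logger = logging.getLogger("api_lag_sketch")


def lagged_servers(lags: Mapping[str, float], threshold: float) -> Dict[str, float]:
    """Return the servers whose lag (in seconds) is strictly above the threshold."""
    return {name: lag for name, lag in lags.items() if lag > threshold}


def should_go_read_only(lags: Mapping[str, float], threshold: float = 5.0) -> bool:
    """Sketch of the rule: read-only only if MORE than half of the servers lag.

    `threshold` plays the role of `APIMaxLagThreshold` from the comment above;
    the default of 5.0 mirrors the 5s mentioned there.
    """
    if not lags:
        # Assumption: with no lag data there is no reason to block writes.
        return False
    lagged = lagged_servers(lags, threshold)
    read_only = len(lagged) > len(lags) / 2
    if read_only:
        # Log the lag times so the threshold can be tuned from real data.
        logger.warning(
            "API read-only: %d of %d servers above %.1fs: %s",
            len(lagged), len(lags), threshold, lagged,
        )
    return read_only


# db1045 is taken from the discussion; the other server names are made up.
one_spike = {"db1045": 26.0, "replica-b": 0.4, "replica-c": 1.2}
two_lagged = {"db1045": 26.0, "replica-b": 6.5, "replica-c": 1.2}

print(should_go_read_only(one_spike))                  # one of three lagged
print(should_go_read_only(two_lagged))                 # two of three lagged
print(should_go_read_only(two_lagged, threshold=7.0))  # higher threshold
```

Under these assumptions, a single spike on one of three servers would not trip the check, and raising the threshold to 7s would tolerate a second server lagging by 6.5s.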
