Joe added a comment.

After some back and forth, we're fairly sure the outage was caused by the 
changes to the jobchron service on the jobrunners that were released on 
Saturday via

https://gerrit.wikimedia.org/r/#/c/208408/

When I correctly restarted the jobchron service (which is not mentioned at all 
on the jobrunners deploy page on wikitech) after reverting that change, the 
contention on redis disappeared.

We had earlier tracked the problem to redis maxing out one CPU while blocked 
in the Lua interpreter, which is probably single-threaded.
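
As a rough illustration of why one CPU-bound script could stall every other 
client, here is a toy Python model. It is not Redis code and not the jobchron 
change itself: the client names, commands and timings are all made up, and it 
only assumes that the server executes commands strictly one at a time.

```python
def simulate_single_threaded_server(commands):
    """Toy model: every command is queued at time 0 and run one at a time.

    commands: list of (client, command, cost_ms) tuples.
    Returns a list of (client, command, wait_ms), where wait_ms is how long
    the command sat in the queue before the server started executing it.
    """
    clock_ms = 0
    waits = []
    for client, command, cost_ms in commands:
        waits.append((client, command, clock_ms))
        clock_ms += cost_ms  # nothing else runs while this command executes
    return waits


# Hypothetical workload: one slow script followed by two cheap commands.
queue = [
    ("jobchron", "EVAL slow_script", 500),
    ("client-a", "GET key", 1),
    ("client-b", "LPUSH queue item", 1),
]

for client, command, wait_ms in simulate_single_threaded_server(queue):
    print(f"{client:9} {command:18} waited {wait_ms} ms")
```

In this model the cheap commands wait for the whole duration of the slow 
script, which would be consistent with one CPU being pegged while other 
clients pile up behind it.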


TASK DETAIL
  https://phabricator.wikimedia.org/T97930

To: Joe
Cc: Stryn, Joe, Krenair, Steinsplitter, Jianhui67, Lydia_Pintscher, 
Sjoerddebruin, Romaine, Aklapper, Multichill, Wikidata-bugs, RobH, aude, 
GWicke, mark, faidon, fgiunchedi, Dzahn, chasemp



_______________________________________________
Wikidata-bugs mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
