hashar added a comment.
T111441 is most probably a beast too large to get rid of in a short time, and I don't even know who could be diverted to work on it.
I would at least clear out the HHVM bytecode caches to be safe. A server that failed had a 202 MByte file:
The table above is for today and shows that a lot of servers are around that size, if not bigger, so I strongly suspect this will trigger the ulimit again.
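A rough sketch of how such a size check could be scripted. It assumes the ulimit in question is the per-process file size limit (RLIMIT_FSIZE), which is not confirmed above, and the cache file path is left for the caller to supply. The function names and the 90% margin are hypothetical, purely for illustration.

```python
import os
import resource
from typing import Optional


def file_size_limit() -> Optional[int]:
    """Return the soft RLIMIT_FSIZE in bytes, or None if it is unlimited.

    Assumption: the ulimit being hit is the file size limit.
    """
    soft, _hard = resource.getrlimit(resource.RLIMIT_FSIZE)
    return None if soft == resource.RLIM_INFINITY else soft


def is_near_limit(path: str, limit: Optional[int], margin: float = 0.9) -> bool:
    """True if the file at `path` is at or above `margin` * `limit` bytes.

    `margin` is a hypothetical safety threshold; an unlimited
    limit (None) never counts as "near".
    """
    if limit is None:
        return False
    return os.path.getsize(path) >= margin * limit
```

Calling `is_near_limit(cache_path, file_size_limit())` on each server before running mwscript would flag the caches worth clearing first; `cache_path` is a placeholder for wherever the bytecode cache lives on that host.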
Ideally, it would be nice to figure out why the bytecode cache has to be written to at all. My assumption is that we should get it compiled once on each deploy, and mwscript would not have to mess with it.
When proceeding with the deployment, I highly recommend doing the jobrunners one at a time and watching logstash for failures as you go. Note that /var/log/mediawiki/jobrunner.log is only readable by root for now (due to T146040).