Hi, Pine. I, too, am interested in building our understanding of our TechOps infrastructure. https://www.mediawiki.org/wiki/Presentations has some explanations of some parts, as does http://wikitech.wikimedia.org/ . I welcome more links to guides/overviews.
At the recent Zurich hackathon, other developers agreed that it would be good to have a guide to Wikimedia's digital infrastructure, especially how MediaWiki is used. https://www.mediawiki.org/wiki/Overview_of_Wikimedia_infrastructure is ... a homepage with approximately nothing on it right now except this diagram of our server architecture: https://commons.wikimedia.org/wiki/File:Wikimedia_Server_Architecture_%28simplified%29.svg

You might find the Performance Guidelines illuminating: https://www.mediawiki.org/wiki/Performance_guidelines . You might also like the recent tech talk by Ori Livneh and Aaron Schulz about how we make Wikipedia fast; see http://www.youtube.com/watch?v=0PqJuZ1_B6w (I don't know when the video is going up on Commons).

-- 
Sumana Harihareswara
Senior Technical Writer
Wikimedia Foundation

On 05/30/2014 06:30 PM, ENWP Pine wrote:
>
> Ori, thanks for following up.
>
> I think I saw somewhere that there is a list of postmortems for tech ops
> disruptions that includes reports like this one. Do you know where the
> list is? I tried a web search and couldn't find a copy of this report
> outside of this email list.
>
> I personally find this report interesting and concise, and I am interested in
> understanding more about the tech ops infrastructure. Reports like this one
> are useful in building that understanding. If there's an overview of tech ops
> somewhere I'd be interested in reading that too. The information on English
> Wikipedia about WMF's server configuration appears to be outdated.
>
> Thanks,
>
> Pine
>
>
>> Date: Thu, 29 May 2014 22:38:10 -0700
>> From: Ori Livneh <[email protected]>
>> To: Wikimedia developers <[email protected]>
>> Subject: Re: [Wikitech-l] 404 errors
>> Message-ID:
>> <cahxk4byya8ae0evgaufwscrjztaqh+sjtw6ccj14mb8o-te...@mail.gmail.com>
>> Content-Type: text/plain; charset=UTF-8
>>
>> On Thu, May 29, 2014 at 1:34 PM, ENWP Pine <[email protected]> wrote:
>>
>>> Hi, I'm getting some 404 errors consistently when trying to load some
>>> English Wikipedia articles. Other pages load ok. Did something break?
>>>
>>
>> TL;DR: A package update went badly.
>>
>> Nitty-gritty postmortem:
>>
>> At 20:25 (all times UTC), change Ie5a860eb9[0] ("Remove
>> wikimedia-task-appserver from app servers") was merged. There were two
>> things wrong with it:
>>
>> 1) The appserver package was configured to delete the mwdeploy and apache
>> users upon removal. The apache user was not deleted because it was logged
>> in, but the mwdeploy user was. The mwdeploy account was declared in Puppet,
>> but there was a gap between the removal of the package and the next Puppet
>> run during which the account would not be present.
>>
>> 2) The package included the symlinks /etc/apache2/wmf and
>> /usr/local/apache/common, which were not Puppetized. These symlinks were
>> unlinked when the package was removed.
>>
>> Apache was configured to load configuration files from /etc/apache2/wmf,
>> and these include the files that declare the DocumentRoot and Directory
>> directives for our sites. As a result, users were served with 404s. At
>> 20:40 Faidon Liambotis re-installed wikimedia-task-appserver on all
>> Apaches.
>> Since 404s are cached in Varnish, it took another five minutes for
>> the rate of 4xx responses to return to normal (20:45).[1]
>>
>> [0]: https://gerrit.wikimedia.org/r/#/c/136151/
>> [1]: https://graphite.wikimedia.org/render/?title=HTTP%204xx%20responses%2C%202014-05-29&from=20:00_20140529&until=21:00_20140529&target=reqstats.4xx&hideLegend=true

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
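
For readers following the postmortem quoted above, here is a minimal sketch of the kind of sanity check that could be run on an app server after a package removal, to notice that an account or a symlink the web server depends on has disappeared. It is not something the thread says was used. The path and account names come from the postmortem; the function names (missing_paths, missing_users, check_host) are hypothetical, and treating those two paths and two accounts as the full set of dependencies is an assumption made for illustration.

```python
import os
import pwd

# Paths and accounts named in the postmortem. Treating them as the complete
# set of dependencies is an assumption made for illustration only.
REQUIRED_PATHS = ["/etc/apache2/wmf", "/usr/local/apache/common"]
REQUIRED_USERS = ["mwdeploy", "apache"]


def missing_paths(paths):
    """Return the paths that do not resolve to an existing file or directory.

    os.path.exists() follows symlinks, so both an unlinked symlink and a
    dangling one are reported as missing.
    """
    return [p for p in paths if not os.path.exists(p)]


def missing_users(names):
    """Return the account names that the local user database cannot resolve."""
    missing = []
    for name in names:
        try:
            pwd.getpwnam(name)
        except KeyError:
            missing.append(name)
    return missing


def check_host(paths=REQUIRED_PATHS, users=REQUIRED_USERS):
    """Collect readable problem descriptions; an empty list means no problem found."""
    problems = [f"missing path: {p}" for p in missing_paths(paths)]
    problems += [f"missing user: {u}" for u in missing_users(users)]
    return problems


if __name__ == "__main__":
    for problem in check_host():
        print(problem)
```

Run on a host in the state the postmortem describes, between the package removal and the next Puppet run, a check like this would be expected to report both symlinks and the mwdeploy account. It uses the Unix-only pwd module, so it is only meaningful on the servers themselves.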
