The incident report is now posted on wikitech: https://wikitech.wikimedia.org/wiki/Incident_documentation/20150205-SiteOutage

On Thu, Feb 5, 2015 at 7:57 PM, Mark Bergsma <[email protected]> wrote:

> Hi all,
>
> We've indeed had a total site outage for roughly 30 minutes. We're still
> collecting all data, but we've tracked down the cause to multiple cascading
> issues including loss of power to a critical SPOF network switch and HHVM
> MediaWiki application servers getting blocked due to multiple suboptimal
> timeout settings. We'll post a full incident report soon, and work to
> correct the underlying issues as soon as possible.
>
> Our apologies,
>
> On Thu, Feb 5, 2015 at 7:03 PM, Guillaume Paumier <[email protected]>
> wrote:
>
>> Hi,
>>
>> On Thursday, 5 February 2015, 09:58:01, George Herbert wrote:
>> > I saw a WMF tweet of a site outage (network?) around 9:30am Pacific time, by
>> > the time I could check now things seem ok on en
>>
>> Sites are mostly back up but there are still issues with login, so the Ops
>> team hasn't had time to write a postmortem yet.
>>
>> --
>> Guillaume Paumier
>>
>> _______________________________________________
>> Wikitech-l mailing list
>> [email protected]
>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>
> --
> Mark Bergsma <[email protected]>
> Lead Operations Architect
> Director of Technical Operations
> Wikimedia Foundation

--
Mark Bergsma <[email protected]>
Lead Operations Architect
Director of Technical Operations
Wikimedia Foundation
