On Tue, Dec 11, 2018, 9:18 PM Krinkle <krinklem...@gmail.com> wrote:

> 📘 Read this post on Phabricator at
> https://phabricator.wikimedia.org/phame/live/1/post/129/
> -------
>
> How’d we do in our striving for operational excellence last month? Read on to
> find out!
>
> - Month in numbers.
> - Current problems.
> - Highlighted stories.
>
> ## 📊 *Month in numbers*
>
> * 4 documented incidents in November 2018. [1]
> * 42 Wikimedia-prod-error tasks closed in November 2018. [2]
> * 36 Wikimedia-prod-error tasks created in November 2018. [3]
> * 165 currently open Wikimedia-prod-error tasks (as of 12 December 2018).
>
> Terminology:
> * An *Exception* (or fatal) causes user actions to be prevented. For
> example, a page would display “Exception: Unable to render page” instead of
> the article content.
> * An *Error* (or non-fatal, or warning) can produce page views that
> complete without reporting a problem, but may show corrupt, incorrect, or
> incomplete information. Examples: an article would display the code word
> “null” instead of the actual content, a user looking for Vegetables may be
> taken to an article about Vegetarians, or a user may receive a notification
> that says “*You have (null) new messages.*”
>
> With that behind us... Let’s celebrate this month’s highlights!
>
> ## *️⃣ *DB exception at wikitech.wikimedia.org*
>
> Quiddity reported that he was unable to disable a spam account, due to a
> fatal exception. Andre Klapper used the Exception ID to find the stack
> trace in the logs. The trace revealed that a table was missing in
> Wikitech’s database.
>
> The MediaWiki software was recently expanded with a “Partial blocking”
> ability. [4] This involved introducing a new database table that stores
> block metadata differently. This software update was deployed to Wikitech,
> but this new table was not created.
>
> @Marostegui (Database administrator) quickly applied the schema patches
> that create the missing table. Thanks Manuel, Andre, and Quiddity;
> Teamwork!
>
> – https://phabricator.wikimedia.org/T209674
>
> ## *️⃣ *Big-page Deletion Unleashed!*
>
> It had been known for years [5] that users are unable to delete or restore
> pages with more than a few hundred revisions. Attempts to do so could fail,
> with a fatal “DBTransactionSizeError” exception. This error indicates that
> the change is too big or too slow. Such changes risk replication lag, and
> may impact the stability of the infrastructure.
>
> The database structure used by MediaWiki for page archives dates back to
> 2003 (over 15 years ago). I'll spare you the details, but it depends on
> database interactions that are inherently slow when applied to systems as
> big as Wikipedia! RFC T20493 intends to modernise this structure for the
> long-term.
>
> Then along came @BPirkle. Bill joined the WMF Core platform team earlier
> this year. He took on the challenge of making page deletion work for any
> size page, today.
>
> Previously, page deletion happened in a single step. This simple approach
> had the benefit of either succeeding in its entirety, or safely rolling
> back like nothing happened. It also meant that the database protected us
> against conflicting changes. In August, Bill started a two-month effort
> that carefully split the logic for “delete a page” into smaller steps that
> are each safe and quick. It now uses our JobQueue to schedule and run these
> steps, without the user having to wait for them.
>
> – https://phabricator.wikimedia.org/T198176 /
> https://gerrit.wikimedia.org/r/456035
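
To illustrate the batching idea described above, here is a minimal sketch in Python. It is not MediaWiki's actual implementation (which is PHP and uses the real JobQueue). The function name `run_batched_delete`, the in-memory `deque` standing in for a job queue, and the plain lists standing in for database tables are all assumptions made for illustration.

```python
from collections import deque


def run_batched_delete(revisions, archive, batch_size=100):
    """Move every revision of one page into the archive in small steps.

    `revisions` and `archive` are plain lists standing in for database
    tables (hypothetical). Returns the number of jobs that ran.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")

    # The user's request only enqueues the first job and returns.
    job_queue = deque(["delete-batch"])
    jobs_run = 0

    while job_queue:
        job_queue.popleft()

        # One job stands for one small transaction: move at most
        # batch_size rows, so each step stays quick.
        batch = revisions[:batch_size]
        del revisions[:batch_size]
        archive.extend(batch)
        jobs_run += 1

        # If rows remain, schedule a follow-up job rather than growing
        # the current step.
        if revisions:
            job_queue.append("delete-batch")

    return jobs_run


revisions = list(range(250))
archive = []
jobs = run_batched_delete(revisions, archive, batch_size=100)
```

The trade-off is the one the post mentions: a single-step delete either succeeds entirely or rolls back, while a batched delete gives up that all-or-nothing property in exchange for steps that are each small and fast.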
>
> ## 📉 *Current problems*
>
> Take a look at the workboard and look for tasks that might need your help.
> The workboard lists known issues, grouped by the week in which they were
> first observed.
>
> → https://phabricator.wikimedia.org/tag/wikimedia-production-error/
>
> I’d like to draw attention to a subset of PHP fatal errors. Specifically,
> those that are publicly exposed (e.g. don’t require elevated user rights)
> and use an HTTP 500 status code.
>
> * CentralNotice: Some Special:CentralNoticeBanners urls fatal. –
> https://phabricator.wikimedia.org/T149240
> * Flow: Unable to view certain talk pages due to workflow
> InvalidDataException. – https://phabricator.wikimedia.org/T70526
> * JsonConfig: Unable to diff certain “.map” pages on Commons. –
> https://phabricator.wikimedia.org/T203063
> * MediaWiki (Parser): Parse API exposes fatal content model error. –
> https://phabricator.wikimedia.org/T206253
> * MediaWiki (Special-pages): Special:DoubleRedirects unavailable on ttwiki.
> – https://phabricator.wikimedia.org/T204800
> * MobileFrontend: Some Special:MobileDiff urls fatal. –
> https://phabricator.wikimedia.org/T156293



Note that this is not a MobileFrontend issue but an issue with the MassMessage
extension; it affects desktop too.

See
https://www.mediawiki.org/w/index.php?title=User%3AQuiddity%2Fdemomodel&type=revision&diff=2234116&oldid=2234115

>
> * ProofreadPage: Unable to edit certain pages on Wikisource. –
> https://phabricator.wikimedia.org/T176196
> * Translate: Some Special:Translate urls fatal. –
> https://phabricator.wikimedia.org/T204833
> * Wikibase: Clicking “undo” for some revisions fatals with a
> PatcherException. – https://phabricator.wikimedia.org/T97146
>
> Public user requests resulting in fatals can cause (and have caused) alerts
> to fire that notify SRE of wikis potentially being less available or down.
>
> 💡*ProTip*: Cross-reference one workboard with another via “Open Tasks” >
> “Advanced Filter” and enter Tag(s) to apply as a filter.
>
> ## 🎉 *Thank you*
>
> Thank you to everyone who helped by reporting or investigating problems in
> Wikimedia production, and by implementing or reviewing their solutions,
> including: tstarling, thiemowmde, thcipriani, Tgr, Steinsplitter, Quiddity,
> pmiazga, Nikerabbit, Mvolz, Lucas_Werkmeister_WMDE, kostajh, jrbs, JJMC89,
> Jdforrester-WMF, hashar, Gilles, Daimona, Ciencia_Al_Poder, Catrope,
> BPirkle, Barkeep49, Anomie, and Aklapper.
>
> Thanks!
>
> Until next time,
>
> – Timo Tijhof
>
> -------
>
> Footnotes:
>
> [1] Incidents. –
>
> https://wikitech.wikimedia.org/wiki/Special:AllPages?from=Incident+documentation%2F20181101&to=Incident+documentation%2F20181131&namespace=0
>
> [2] Tasks closed. –
> https://phabricator.wikimedia.org/maniphest/query/.PkyGL4Rz_4i/#R
>
> [3] Tasks opened. –
> https://phabricator.wikimedia.org/maniphest/query/WsqbAxlHPLwk/#R
>
> [4] Partial blocks. –
>
> https://meta.wikimedia.org/wiki/Community_health_initiative/Per-user_page,_namespace,_and_upload_blocking
>
> [5] Bug report about page deletion, 2007. –
> https://phabricator.wikimedia.org/T13402
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

-- 
Jon Robson
twitter: @jdlrobson
linkedin: https://www.linkedin.com/in/jorobson/
