I see. Very well, I'll open a task.
Thank you.
Igal

On Wed, Mar 1, 2023 at 22:07 Brian Wolff <[email protected]> wrote:
> You should probably just file a bug.
>
> It's certainly plausible it had something to do with the data center switch, but it could just as equally be unrelated. It requires someone to investigate what part of the system failed (job queue? Varnish? Swift?), which would then lead to a root cause. It's pretty impossible to say without further investigation, and speculating on list is probably not helpful.
>
> --
> Bawolff
>
> On Wednesday, March 1, 2023, Igal Khitron <[email protected]> wrote:
>
>> Hello.
>> Is the new files bug happening because of the database switch? And if it is, will you fix it?
>> The story: If you reupload a file, locally or on Commons, the new version does not get a thumb, and the old one is shown in articles. The only way to see the new version is to open the file in the media: namespace. I waited for hours. So, you can't edit files any more. I read the Commons reupload log a bit, and it looks like it's not just my problem. The regular things (clear cache, purge, null edit) do not help.
>> I think a Phab task should be created, but I'd like to know your answer first.
>> Thank you.
>> Igal (user:IKhitron)
>>
>>
>> On Wed, Mar 1, 2023 at 19:26 Giuseppe Lavagetto <[email protected]> wrote:
>>
>>> Specifically:
>>>
>>> if we measure read only time as "an editor can't start an edit because wikis are read only", then the read-only time is 119s;
>>> if we measure it by the last timestamp of an edit being saved, that's 94 seconds.
>>>
>>> As Amir explained, we leave some room for propagation of the MediaWiki read-only mode (about 10-15 seconds) and for in-flight edits (another 10 seconds) before we set the databases to read-only as well.
>>>
>>> I think 2 minutes of read-only for such a complex operation is a good balance between reasonable change safety and reduction of impact; we could reduce the read-only time by another 10-20 seconds with some more aggressive moves (like clearing the DNS recursor caches) but I don't think there's a big value there at this point.
>>>
>>> I'll add: if anyone is interested in knowing more and they're coming to the hackathon, I'll be happy to make an impromptu session about how we handle this procedure.
>>>
>>> Cheers,
>>>
>>> Giuseppe
>>>
>>> On Wed, Mar 1, 2023 at 6:16 PM Amir Sarabadani <[email protected]> wrote:
>>>
>>>> It's a bit complicated.
>>>> When SRE sets the read-only mark, they start counting from that time. The mark then starts propagating, which takes a while to be actually shown to all users, but some users might still see the RO error while some actual writes are happening somewhere else because the cache is not invalidated yet (I think it has a TTL of 5 seconds but I need to double check). We still consider that as RO time because it's affecting users regardless.
>>>>
>>>> HTH
>>>>
>>>>
>>>>
>>>> On Wed, Mar 1, 2023 at 18:06 Dušan Kreheľ <[email protected]> wrote:
>>>>
>>>>> Clément Goubert and everybody,
>>>>>
>>>>> I analyzed https://stream.wikimedia.org/v2/stream/recentchange and I got different results.
>>>>>
>>>>> Last change (before migration): 2023-03-01T14:00:30
>>>>> First change (after migration): 2023-03-01T14:02:05
>>>>> Result: Down time (14:00:31 to 14:02:05) is 94s.
>>>>>
>>>>> I think that analysis is more authoritative. I think it analyzes based on something like REQUEST_TIME in PHP.
>>>>>
>>>>> Dušan Kreheľ
>>>>>
>>>>>
>>>>> 2023-03-01 16:30 GMT+01:00, Clément Goubert <[email protected]>:
>>>>> > Dear Wikitechians,
>>>>> >
>>>>> > Dear colleagues,
>>>>> >
>>>>> > The switchover process requires a *brief read-only period for all Foundation-hosted wikis*, which started at *14:00 UTC on Wednesday March 1st*, and lasted *119 seconds*. All our public and private wikis continued to be available for reading as usual. Users saw a notification of the upcoming maintenance, and anyone still editing was asked to try again in a few minutes.
>>>>> >
>>>>> > As a side note, with other SREs we have been trying to discern the effect of the Switchover in many of the graphs we have to monitor the infrastructure in https://grafana.wikimedia.org during Switchover. In many, it's impossible to tell the event. The most discernible graph we have is of the edit rate, which can be viewed here: Grafana <https://grafana-rw.wikimedia.org/d/000000208/edit-count?from=1677673800000&orgId=1&to=1677681000000>. Can you spot it? See the attached picture to help:
>>>>> >
>>>>> > I am extending thanks to everyone that was also present on IRC, helping out in any way that they could. Thanks as well to Community Relations who notified communities of the read-only window ahead of time. And thanks to everyone that contributed to MultiDC <https://wikitech.wikimedia.org/wiki/Performance/Multi-DC_MediaWiki>, especially Performance for pushing forward with the last parts of it, allowing us to perform this Switchover faster and with more confidence than ever before.
>>>>> >
>>>>> > If you wanna relive through the Switchover, here's a link to a capture of Listen to Wikipedia <https://en.wikipedia.org/wiki/Listen_to_Wikipedia> during the Switchover: Listen to the Switchover <https://drive.google.com/file/d/1jqQUVCq3ksjOM5bKoIfCZ5Zt9RRW1Nl_/view?usp=share_link> (spoiler: the part with no sounds is the switchover)
>>>>> >
>>>>> > A similar event will follow a few weeks later, when we move back to Virginia. This is currently scheduled for *Wednesday, April 26th*.
>>>>> > Thank you,
>>>>> >
>>>>> > On Tue, Feb 21, 2023 at 1:55 PM Clément Goubert <[email protected]> wrote:
>>>>> >
>>>>> >> Dear Wikitechians,
>>>>> >>
>>>>> >> I would like to remind you that the datacenter switchover will happen on *Wednesday March 1st* starting at *14:00 UTC.*
>>>>> >>
>>>>> >> Please refer to the original email for any additional information. As always, you can reach out to me directly or the SRE team in #wikimedia-sre on IRC with any question, or through Phabricator.
>>>>> >>
>>>>> >> Thank you,
>>>>> >>
>>>>> >> On Tue, Feb 14, 2023 at 1:58 PM Clément Goubert <[email protected]> wrote:
>>>>> >>
>>>>> >>> Dear Wikitechians,
>>>>> >>>
>>>>> >>> On *Wednesday March 1st*, the SRE team will run a planned data center switchover, moving all wikis from our primary data center in Virginia to the secondary data center in Texas. This is an important periodic test of our tools and procedures, to ensure the wikis will continue to be available even in the event of major technical issues in our primary home. It also gives all our SRE and ops teams a chance to do maintenance and upgrades on systems in Virginia that normally run 24 hours a day.
>>>>> >>>
>>>>> >>> The switchover process requires a *brief read-only period for all Foundation-hosted wikis*, which will start at *14:00 UTC on Wednesday March 1st*, and will last for a few minutes while we execute the migration as efficiently as possible. All our public and private wikis will be continuously available for reading as usual, but no one will be able to save edits during the process. Users will see a notification of the upcoming maintenance, and anyone still editing will be asked to try again in a few minutes.
>>>>> >>>
>>>>> >>> CommRel has already begun notifying communities of the read-only window. A similar event will follow a few weeks later, when we move back to Virginia. This is currently scheduled for *Wednesday, April 26th*.
>>>>> >>>
>>>>> >>> If you like, you can follow along on the day in the public #wikimedia-operations channel on IRC (instructions for joining here <https://meta.wikimedia.org/wiki/IRC/Instructions>). To report any issues, you can reach us in #wikimedia-sre on IRC, or file a Phabricator ticket with the *datacenter-switchover* tag (pre-filled form here <https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=Datacenter-Switchover&subscribers=Clement_Goubert>); we'll be monitoring closely for reports of trouble during and after the switchover. (If you're new to Phab, there's more information at Phabricator/Help.)
>>>>> >>> The switchover and its preparation are tracked in Phabricator Task T327920 <https://phabricator.wikimedia.org/T327920>
>>>>> >>>
>>>>> >>> On behalf of the SRE team, please excuse the disruption, and our thanks to everyone in a number of departments who've been involved in planning this work for the past weeks. Feel free to reply directly to me with any questions.
>>>>> >>>
>>>>> >>> Thank you,
>>>>> >>>
>>>>> >>> --
>>>>> >>> Clément Goubert (they/them)
>>>>> >>> Senior SRE
>>>>> >>> Wikimedia Foundation
>>>>> >>>
>>>>> >>
>>>>> >>
>>>>> >> --
>>>>> >> Clément Goubert (they/them)
>>>>> >> Senior SRE
>>>>> >> Wikimedia Foundation
>>>>> >>
>>>>> >
>>>>> >
>>>>> > --
>>>>> > Clément Goubert (they/them)
>>>>> > Senior SRE
>>>>> > Wikimedia Foundation
>>>>> >
>>>>> _______________________________________________
>>>>> Wikitech-l mailing list -- [email protected]
>>>>> To unsubscribe send an email to [email protected]
>>>>> https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
>>>>
>>>>
>>>>
>>>> --
>>>> Amir (he/him)
>>>
>>>
>>>
>>> --
>>> Giuseppe Lavagetto
>>> Principal Site Reliability Engineer, Wikimedia Foundation
