Hello everyone,

Just as Clément wrote in the first message, on *Wednesday April 26*, there
will be a switchover from our secondary data center in Texas back to the
primary data center in Virginia. That day, starting at *14:00 UTC*, there
will be a brief (just a few minutes) read-only period for all
Foundation-hosted wikis.

We have sent a message to all the wikis; a banner announcing the
maintenance will also be displayed 30 minutes before the operation
begins.

If you're interested in the details about the switchover, scroll up to
Clément's message :)

Thanks,

Szymon Grabarczuk (he/him)

Senior Community Relations Specialist

Wikimedia Foundation <https://wikimediafoundation.org/>



On Wed, Mar 1, 2023 at 9:11 PM Igal Khitron <[email protected]> wrote:

> I see. Very well, I'll open a task.
> Thank you.
> Igal
>
> On Wed, Mar 1, 2023 at 22:07, Brian Wolff <[email protected]> wrote:
>
>> You should probably just file a bug.
>>
>> It's certainly plausible it had something to do with the data center switch,
>> but it could just as easily be unrelated. It requires someone to
>> investigate what part of the system failed (job queue? Varnish? Swift?)
>> which would then lead to a root cause. It's pretty much impossible to say without
>> further investigation, and speculating on list is probably not helpful.
>>
>> --
>> Bawolff
>>
>> On Wednesday, March 1, 2023, Igal Khitron <[email protected]> wrote:
>>
>>> Hello.
>>> Is the new files bug happening because of the database switch? And if it
>>> is, will you fix it?
>>> The story: if you reupload a file, locally or on Commons, no thumbnail is
>>> created for the new version, and the old one is shown in articles. The only way
>>> to see the new version is to open the file in the media: namespace. I waited for
>>> hours. So you can't edit files any more. I read the Commons reupload log a
>>> bit, and it looks like it's not just my problem. The usual remedies (clearing the
>>> cache, purge, null edit) do not help.
>>> I think a Phab task should be created, but I'd like to know your
>>> answer first.
>>> Thank you.
>>> Igal (user:IKhitron)
>>>
>>>
>>> On Wed, Mar 1, 2023 at 19:26, Giuseppe Lavagetto <[email protected]> wrote:
>>>
>>>> Specifically:
>>>>
>>>> if we measure read only time as "an editor can't start an edit because
>>>> wikis are read only", then the read-only time is 119s;
>>>> if we measure it by the last timestamp of an edit being saved, that's
>>>> 94 seconds.
>>>>
>>>> As Amir explained, we leave some room for propagation of the MediaWiki
>>>> read-only mode (about 10-15 seconds) and for in-flight edits (another 10
>>>> seconds) before we set the databases to read-only as well.
>>>>
>>>> I think 2 minutes of read-only for such a complex operation is a
>>>> good balance between reasonable change safety and reduction of impact; we
>>>> could reduce the read-only time by another 10-20 seconds with some more
>>>> aggressive moves (like clearing the DNS recursor caches) but I don't think
>>>> there's much value in that at this point.
>>>>
>>>> I'll add: if anyone is interested in knowing more and they're coming to
>>>> the hackathon, I'll be happy to make an impromptu session about how we
>>>> handle this procedure.
>>>>
>>>> Cheers,
>>>>
>>>> Giuseppe
>>>>
>>>> On Wed, Mar 1, 2023 at 6:16 PM Amir Sarabadani <[email protected]>
>>>> wrote:
>>>>
>>>>> It's a bit complicated.
>>>>> When SRE sets the read-only mark, they start counting from that time.
>>>>> The mark then starts propagating, which takes a while to actually reach all
>>>>> users, so some users might already see the RO error while some actual writes
>>>>> are still happening somewhere else because the cache is not invalidated yet (I
>>>>> think it has a TTL of 5 seconds but I need to double check). We still
>>>>> consider that as RO time because it's affecting users regardless.
>>>>>
>>>>> HTH
>>>>>
>>>>>
>>>>>
>>>>> On Wed, Mar 1, 2023 at 18:06, Dušan Kreheľ <[email protected]> wrote:
>>>>>
>>>>>> Clément Goubert and everybody,
>>>>>>
>>>>>> I analyzed https://stream.wikimedia.org/v2/stream/recentchange and I
>>>>>> got different results.
>>>>>>
>>>>>> Last change (before migration): 2023-03-01T14:00:30
>>>>>> First change (after migration):    2023-03-01T14:02:05
>>>>>> Result: Down time (14:00:31 to 14:02:05) is 94s.
>>>>>>
>>>>>> I think this analysis is more authoritative. I think the timestamps are
>>>>>> based on something like REQUEST_TIME in PHP.
>>>>>>
>>>>>> Dušan Kreheľ
>>>>>>
>>>>>>
>>>>>> 2023-03-01 16:30 GMT+01:00, Clément Goubert <[email protected]>:
>>>>>> > Dear Wikitechians,
>>>>>> >
>>>>>> > Dear colleagues,
>>>>>> >
>>>>>> > The switchover process requires a *brief read-only period for all
>>>>>> > Foundation-hosted wikis*, which started at *14:00 UTC on Wednesday March
>>>>>> > 1st*, and lasted *119 seconds*. All our public and private wikis continued
>>>>>> > to be available for reading as usual. Users saw a notification of the
>>>>>> > upcoming maintenance, and anyone still editing was asked to try again in a
>>>>>> > few minutes.
>>>>>> >
>>>>>> > As a side note, with other SREs we have been trying to discern the effect
>>>>>> > of the Switchover in many of the graphs we have to monitor the
>>>>>> > infrastructure in https://grafana.wikimedia.org during Switchover. In many,
>>>>>> > it's impossible to tell the event. The most discernible graph we have is of
>>>>>> > the edit rate, which can be viewed here: Grafana
>>>>>> > <https://grafana-rw.wikimedia.org/d/000000208/edit-count?from=1677673800000&orgId=1&to=1677681000000>.
>>>>>> > Can you spot it? See the attached picture to help:
>>>>>> >
>>>>>> > I am extending thanks to everyone that was also present on IRC, helping out
>>>>>> > in any way that they could. Thanks as well to Community Relations who
>>>>>> > notified communities of the read-only window ahead of time. And thanks to
>>>>>> > everyone that contributed to MultiDC
>>>>>> > <https://wikitech.wikimedia.org/wiki/Performance/Multi-DC_MediaWiki>,
>>>>>> > especially Performance for pushing forward with the last parts of it,
>>>>>> > allowing us to perform this Switchover faster and with more confidence than
>>>>>> > ever before.
>>>>>> >
>>>>>> > If you wanna relive through the Switchover, here's a link to a capture
>>>>>> > of Listen to Wikipedia <https://en.wikipedia.org/wiki/Listen_to_Wikipedia>
>>>>>> > during the Switchover: Listen to the Switchover
>>>>>> > <https://drive.google.com/file/d/1jqQUVCq3ksjOM5bKoIfCZ5Zt9RRW1Nl_/view?usp=share_link>
>>>>>> > (spoiler: the part with no sounds is the switchover)
>>>>>> >
>>>>>> > A similar event will follow a few weeks later, when we move back to
>>>>>> > Virginia. This is currently scheduled for *Wednesday, April 26th*.
>>>>>> > Thank you,
>>>>>> >
>>>>>> > On Tue, Feb 21, 2023 at 1:55 PM Clément Goubert <[email protected]>
>>>>>> > wrote:
>>>>>> >
>>>>>> >> Dear Wikitechians,
>>>>>> >>
>>>>>> >> I would like to remind you that the datacenter switchover will happen on
>>>>>> >> *Wednesday March 1st* starting at *14:00 UTC.*
>>>>>> >>
>>>>>> >> Please refer to the original email for any additional information. As
>>>>>> >> always, you can reach out to me directly or the SRE team in #wikimedia-sre
>>>>>> >> on IRC with any question, or through Phabricator.
>>>>>> >>
>>>>>> >> Thank you,
>>>>>> >>
>>>>>> >> On Tue, Feb 14, 2023 at 1:58 PM Clément Goubert <[email protected]>
>>>>>> >> wrote:
>>>>>> >>
>>>>>> >>> Dear Wikitechians,
>>>>>> >>>
>>>>>> >>> On *Wednesday March 1st*, the SRE team will run a planned data center
>>>>>> >>> switchover, moving all wikis from our primary data center in Virginia to
>>>>>> >>> the secondary data center in Texas. This is an important periodic test of
>>>>>> >>> our tools and procedures, to ensure the wikis will continue to be available
>>>>>> >>> even in the event of major technical issues in our primary home. It also
>>>>>> >>> gives all our SRE and ops teams a chance to do maintenance and upgrades on
>>>>>> >>> systems in Virginia that normally run 24 hours a day.
>>>>>> >>>
>>>>>> >>> The switchover process requires a *brief read-only period for all
>>>>>> >>> Foundation-hosted wikis*, which will start at *14:00 UTC on Wednesday
>>>>>> >>> March 1st*, and will last for a few minutes while we execute the
>>>>>> >>> migration as efficiently as possible. All our public and private wikis will
>>>>>> >>> be continuously available for reading as usual, but no one will be able to
>>>>>> >>> save edits during the process. Users will see a notification of the
>>>>>> >>> upcoming maintenance, and anyone still editing will be asked to try again
>>>>>> >>> in a few minutes.
>>>>>> >>>
>>>>>> >>> CommRel has already begun notifying communities of the read-only window.
>>>>>> >>> A similar event will follow a few weeks later, when we move back to
>>>>>> >>> Virginia. This is currently scheduled for *Wednesday, April 26th*.
>>>>>> >>>
>>>>>> >>> If you like, you can follow along on the day in the public
>>>>>> >>> #wikimedia-operations channel on IRC (instructions for joining here
>>>>>> >>> <https://meta.wikimedia.org/wiki/IRC/Instructions>). To report any
>>>>>> >>> issues, you can reach us in #wikimedia-sre on IRC, or file a Phabricator
>>>>>> >>> ticket with the *datacenter-switchover* tag (pre-filled form here
>>>>>> >>> <https://phabricator.wikimedia.org/maniphest/task/edit/form/1/?projects=Datacenter-Switchover&subscribers=Clement_Goubert>);
>>>>>> >>> we'll be monitoring closely for reports of trouble during and after the
>>>>>> >>> switchover. (If you're new to Phab, there's more information at
>>>>>> >>> Phabricator/Help.) The switchover and its preparation are
>>>>>> >>> tracked in Phabricator Task T327920
>>>>>> >>> <https://phabricator.wikimedia.org/T327920>.
>>>>>> >>>
>>>>>> >>> On behalf of the SRE team, please excuse the disruption, and our thanks
>>>>>> >>> to everyone in a number of departments who've been involved in planning
>>>>>> >>> this work for the past weeks. Feel free to reply directly to me with any
>>>>>> >>> questions.
>>>>>> >>>
>>>>>> >>> Thank you,
>>>>>> >>>
>>>>>> >>> --
>>>>>> >>> Clément Goubert (they/them)
>>>>>> >>> Senior SRE
>>>>>> >>> Wikimedia Foundation
>>>>>> >>>
>>>>>> >>
>>>>>> >>
>>>>>> >> --
>>>>>> >> Clément Goubert (they/them)
>>>>>> >> Senior SRE
>>>>>> >> Wikimedia Foundation
>>>>>> >>
>>>>>> >
>>>>>> >
>>>>>> > --
>>>>>> > Clément Goubert (they/them)
>>>>>> > Senior SRE
>>>>>> > Wikimedia Foundation
>>>>>> >
>>>>>> _______________________________________________
>>>>>> Wikitech-l mailing list -- [email protected]
>>>>>> To unsubscribe send an email to [email protected]
>>>>>>
>>>>>> https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
>>>>>
>>>>>
>>>>>
>>>>> --
>>>>> Amir (he/him)
>>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Giuseppe Lavagetto
>>>> Principal Site Reliability Engineer, Wikimedia Foundation
>>>
>
