I suggested it on T20209#3535024 back in August; thanks, Brad, for taking
care of it :)

Just to add a side note regarding user=0 combined with a non-IP user_text
value: a few months ago I saw that this was quite common in Wikidata's
recentchanges table for rows with rc_type=5 (RC_EXTERNAL), though I can't
find any such rows anymore.
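
In case anyone wants to check their own data for this pattern, here is a
rough Python sketch of the condition I mean. It is only an approximation:
the (user id, user text, rc_type) row layout and the helper names are made
up for illustration, and Python's ipaddress module is stricter than
MediaWiki's own IP detection, so results may differ on edge cases.

```python
import ipaddress

RC_EXTERNAL = 5  # the rc_type value mentioned above


def looks_like_ip(user_text: str) -> bool:
    """Return True if user_text parses as an IPv4 or IPv6 address."""
    try:
        ipaddress.ip_address(user_text)
        return True
    except ValueError:
        return False


def is_name_without_account(user_id: int, user_text: str) -> bool:
    """True for rows that carry user id 0 but a non-IP name."""
    return user_id == 0 and not looks_like_ip(user_text)


# Hypothetical sample rows: (user id, user text, rc_type).
rows = [
    (0, "192.0.2.17", 0),         # ordinary IP edit
    (0, "Example", RC_EXTERNAL),  # id 0 with a name: the odd case
    (42, "Example", 0),           # ordinary registered user
    (0, "2001:db8::1", 0),        # IPv6 edit
]

suspicious = [row for row in rows if is_name_without_account(row[0], row[1])]
```

With the sample rows above, only the second row ends up in `suspicious`.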

On Thu, Nov 30, 2017 at 7:31 PM, Brad Jorsch (Anomie) <[email protected]> wrote:

> The proposal was approved by TechCom, the code has been merged, and it's
> live now on the Beta Cluster. I'm running the maintenance script now.
> Please test things there and report any bugs you encounter, either by
> replying to this message or by filing it in Phabricator and adding me as a
> subscriber. Assuming no major errors turn up that can't be quickly fixed,
> I'll probably start running the maintenance script on the production wikis
> the week of December 11 (and perhaps on mediawiki.org and testwiki the
> week before).
>
> If you're curious as to what the history of an existing imported page might
> look like after the maintenance script is run, see
> https://commons.wikimedia.beta.wmflabs.org/wiki/Template:Documentation?action=history
> for an example.
>
> On Tue, Oct 31, 2017 at 10:52 AM, Brad Jorsch (Anomie) <[email protected]> wrote:
>
> > Handling of usernames in imported edits in MediaWiki has long been weird
> > (T9240[1] was filed in 2006!).
> >
> > If the local user doesn't exist, we get a strange row in the revision
> > table where rev_user_text refers to a valid name while rev_user is 0,
> > which typically indicates an IP edit. Someone can later create the name,
> > but rev_user remains 0, so depending on which field a tool looks at, the
> > revision may or may not be considered to actually belong to the
> > newly-created user.
> >
> > If the local user does exist when the import is done, the edit is
> > attributed to that user regardless of whether it's actually the same
> > user. See T179246[2] for an example where imported edits got attributed
> > to the wrong account in pre-SUL times.
> >
> > In Gerrit change 386625[3] I propose to change that.
> >
> >    - If revisions are imported using the "Upload XML data" method, it
> >    will be required to fill in a new field to indicate the source of the
> >    edits, which is intended to be interpreted as an interwiki prefix.
> >    - If revisions are imported using the "Import from another wiki"
> >    method, the specified source wiki will be used as the source.
> >    - During the import, any usernames that don't exist locally (and can't
> >    be auto-created via CentralAuth[4]) will be imported as an
> >    otherwise-invalid name, e.g. an edit by User:Example from source 'en'
> >    would be imported as "en>Example".[5]
> >    - There will be a checkbox on Special:Import to specify whether the
> >    same should be done for usernames that do exist locally (or can be
> >    created) or whether those edits should be attributed to the
> >    existing/autocreated local user.
> >    - On history pages, log pages, and the like, these usernames will be
> >    displayed as interwiki links, much as might be generated by wikitext
> >    like "[[:en:User:Example|en>Example]]". No parenthesized 'tool' links
> >    (talk, block, and so on) will be generated for these rows.
> >    - On WMF wikis, we'll run a maintenance script to clean up the
> >    existing rows with valid usernames and rev_user = 0. The current plan
> >    there is to attribute these edits to existing SUL users where possible
> >    and to prefix them with a generic prefix otherwise, but we could as
> >    easily prefix them all.
> >       - Unfortunately it's impossible to retroactively determine the
> >       actual source of old imports automatically or to automatically do
> >       anything about imports that were misattributed to a different local
> >       user in pre-SUL times (e.g. T179246[2]).
> >    - The same will be done for CentralAuth's global suppression blocks.
> >    In this case, on WMF wikis we can safely point them all at Meta.
> >
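
To make the naming scheme in the list above concrete, here is a small
Python sketch of how I read it. It is not the code from the Gerrit change:
the function names, the local_users set (standing in for "exists locally
or can be auto-created"), and the assign_known flag (standing in for the
Special:Import checkbox) are all hypothetical.

```python
from typing import Optional, Set

SEPARATOR = ">"  # per footnote [5], invalid in usernames and page titles


def imported_username(source_prefix: str, name: str) -> str:
    """Build the otherwise-invalid name, e.g. 'en' + 'Example' -> 'en>Example'."""
    return f"{source_prefix}{SEPARATOR}{name}"


def attribution(source_prefix: str, name: str,
                local_users: Set[str], assign_known: bool) -> str:
    """Pick the name an imported edit is stored under.

    local_users models names that exist locally or can be auto-created;
    assign_known models the checkbox described in the proposal.
    """
    if assign_known and name in local_users:
        return name
    return imported_username(source_prefix, name)


def history_link_wikitext(user_text: str) -> Optional[str]:
    """Return roughly equivalent interwiki wikitext for a prefixed name."""
    if SEPARATOR not in user_text:
        return None  # an ordinary local name or IP, no interwiki link
    prefix, name = user_text.split(SEPARATOR, 1)
    return f"[[:{prefix}:User:{name}|{user_text}]]"
```

Under these assumptions, attribution("en", "Example", set(), True) gives
"en>Example", and history_link_wikitext("en>Example") gives the wikitext
quoted in the proposal.
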
> > If you have comments on this proposal, please reply here or on
> > https://gerrit.wikimedia.org/r/#/c/386625/.
> >
> >
> > Background: The upcoming actor table changes[6] require some change to
> > the handling of these imported names because we can't have separate
> > attribution to "Example as a non-registered user" and "Example as a
> > registered user" with the new schema. The options we've identified are:
> >
> >    1. This proposal, or something much like it.
> >    2. All the existing rows with rev_user = 0 would have to be attributed
> >    to the existing local user (if any), and in the future when a new user
> >    is created any existing edits attributed to that name will be
> >    automatically attributed to that new account.
> >    3. All the existing rows with rev_user = 0 and an existing local user
> >    would have to be re-attributed to different *valid* usernames,
> >    probably randomly-generated in some manner, and in the future when a
> >    new user is created any existing edits for that name would have to be
> >    similarly re-attributed.
> >    4. Like #2, except the creation (including SUL auto-creation) of the
> >    same-named account would not be allowed. Thus, an import before the
> >    local name exists would forever block that name from being used for an
> >    actual local account.
> >    5. Some less consistent combination of the "all the existing rows" and
> >    "when a new user is created" options from #2–4.
> >
> > Of these options, this proposal seems like the best one.
> >
> > [1]: https://phabricator.wikimedia.org/T9240
> > [2]: https://phabricator.wikimedia.org/T179246
> > [3]: https://gerrit.wikimedia.org/r/#/c/386625/
> > [4]: https://phabricator.wikimedia.org/T111605
> > [5]: ">" was chosen rather than the more typical ":" because the former
> > is already invalid in all usernames (and page titles). While a colon is
> > *now* disallowed in new usernames, existing names created before that
> > restriction was added can continue to be used (and there are over 12000
> > such usernames in WMF's SUL) and we decided it'd be better not to
> > suddenly break them.
> > [6]: https://phabricator.wikimedia.org/T167246
> >
> > --
> > Brad Jorsch (Anomie)
> > Senior Software Engineer
> > Wikimedia Foundation
> >
>
>
>
> --
> Brad Jorsch (Anomie)
> Senior Software Engineer
> Wikimedia Foundation
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>