I suggested it on T20209#3535024 back in August; thanks, Brad, for taking care of it :)
Just to add a side note regarding user=0 and user_text with some non-IP value: I saw it was quite common in the Wikidata recentchanges table a few months ago with rc_type=5 (RC_EXTERNAL), though I can't see any such rows anymore.

On Thu, Nov 30, 2017 at 7:31 PM, Brad Jorsch (Anomie) <[email protected]> wrote:

> The proposal was approved by TechCom, the code has been merged, and it's
> live now on the Beta Cluster. I'm running the maintenance script now.
> Please test things there and report any bugs you encounter, either by
> replying to this message or by filing it in Phabricator and adding me as a
> subscriber. Assuming no major errors turn up that can't be quickly fixed,
> I'll probably start running the maintenance script on the production wikis
> the week of December 11 (and perhaps on mediawiki.org and testwiki the week
> before).
>
> If you're curious as to what the history of an existing imported page might
> look like after the maintenance script is run, see
> https://commons.wikimedia.beta.wmflabs.org/wiki/Template:Documentation?action=history
> for an example.
>
> On Tue, Oct 31, 2017 at 10:52 AM, Brad Jorsch (Anomie) <[email protected]> wrote:
>
> > Handling of usernames in imported edits in MediaWiki has long been weird
> > (T9240[1] was filed in 2006!).
> >
> > If the local user doesn't exist, we get a strange row in the revision
> > table where rev_user_text refers to a valid name while rev_user is 0 which
> > typically indicates an IP edit. Someone can later create the name, but
> > rev_user remains 0, so depending on which field a tool looks at the
> > revision may or may not be considered to actually belong to the
> > newly-created user.
> >
> > If the local user does exist when the import is done, the edit is
> > attributed to that user regardless of whether it's actually the same user.
> > See T179246[2] for an example where imported edits got attributed to the
> > wrong account in pre-SUL times.
> >
> > In Gerrit change 386625[3] I propose to change that.
> >
> > - If revisions are imported using the "Upload XML data" method, it
> >   will be required to fill in a new field to indicate the source of the
> >   edits, which is intended to be interpreted as an interwiki prefix.
> > - If revisions are imported using the "Import from another wiki"
> >   method, the specified source wiki will be used as the source.
> > - During the import, any usernames that don't exist locally (and can't
> >   be auto-created via CentralAuth[4]) will be imported as an
> >   otherwise-invalid name, e.g. an edit by User:Example from source 'en'
> >   would be imported as "en>Example".[5]
> > - There will be a checkbox on Special:Import to specify whether the
> >   same should be done for usernames that do exist locally (or can be
> >   created) or whether those edits should be attributed to the
> >   existing/autocreated local user.
> > - On history pages, log pages, and the like, these usernames will be
> >   displayed as interwiki links, much as might be generated by wikitext
> >   like "[[:en:User:Example|en>Example]]". No parenthesized 'tool' links
> >   (talk, block, and so on) will be generated for these rows.
> > - On WMF wikis, we'll run a maintenance script to clean up the
> >   existing rows with valid usernames and rev_user = 0. The current plan
> >   there is to attribute these edits to existing SUL users where possible
> >   and to prefix them with a generic prefix otherwise, but we could as
> >   easily prefix them all.
> > - Unfortunately it's impossible to retroactively determine the
> >   actual source of old imports automatically or to automatically do
> >   anything about imports that were misattributed to a different local
> >   user in pre-SUL times (e.g. T179246[2]).
> > - The same will be done for CentralAuth's global suppression
> >   blocks. In this case, on WMF wikis we can safely point them all at Meta.
> >
> > If you have comments on this proposal, please reply here or on
> > https://gerrit.wikimedia.org/r/#/c/386625/.
> >
> > Background: The upcoming actor table changes[6] require some change to the
> > handling of these imported names because we can't have separate attribution
> > to "Example as a non-registered user" and "Example as a registered user"
> > with the new schema. The options we've identified are:
> >
> > 1. This proposal, or something much like it.
> > 2. All the existing rows with rev_user = 0 would have to be attributed
> >    to the existing local user (if any), and in the future when a new user
> >    is created any existing edits attributed to that name will be
> >    automatically attributed to that new account.
> > 3. All the existing rows with rev_user = 0 and an existing local user
> >    would have to be re-attributed to different *valid* usernames,
> >    probably randomly-generated in some manner, and in the future when a
> >    new user is created any existing edits for that name would have to be
> >    similarly re-attributed.
> > 4. Like #2, except the creation (including SUL auto-creation) of the
> >    same-named account would not be allowed. Thus, an import before the
> >    local name exists would forever block that name from being used for an
> >    actual local account.
> > 5. Some less consistent combination of the "all the existing rows" and
> >    "when a new user is created" options from #2–4.
> >
> > Of these options, this proposal seems like the best one.
> >
> > [1]: https://phabricator.wikimedia.org/T9240
> > [2]: https://phabricator.wikimedia.org/T179246
> > [3]: https://gerrit.wikimedia.org/r/#/c/386625/
> > [4]: https://phabricator.wikimedia.org/T111605
> > [5]: ">" was chosen rather than the more typical ":" because the former is
> > already invalid in all usernames (and page titles). While a colon is *now*
> > disallowed in new usernames, existing names created before that restriction
> > was added can continue to be used (and there are over 12000 such usernames
> > in WMF's SUL) and we decided it'd be better not to suddenly break them.
> > [6]: https://phabricator.wikimedia.org/T167246
> >
> > --
> > Brad Jorsch (Anomie)
> > Senior Software Engineer
> > Wikimedia Foundation
>
> --
> Brad Jorsch (Anomie)
> Senior Software Engineer
> Wikimedia Foundation
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
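
For anyone who wants to reason about the naming rule in the quoted proposal, here is a minimal Python sketch of how I read it. This is not MediaWiki's actual code (which is PHP). The function name imported_username, the local_users set, the can_autocreate callback (standing in for CentralAuth auto-creation), and the assign_to_local flag (standing in for the Special:Import checkbox) are all hypothetical names made up for illustration. IP edits are not handled here.

```python
from typing import Callable, Set


def imported_username(
    name: str,
    source_prefix: str,
    local_users: Set[str],
    assign_to_local: bool = True,
    can_autocreate: Callable[[str], bool] = lambda _name: False,
) -> str:
    """Return the name an imported edit would be stored under.

    Hypothetical sketch of the rule in the proposal:
    - a name that neither exists locally nor can be auto-created is
      always stored as "<prefix>><name>", which is not a valid username;
    - a name that exists (or can be auto-created) is kept as-is only
      when the importer chose to attribute edits to local users.
    """
    exists_or_creatable = name in local_users or can_autocreate(name)
    if exists_or_creatable and assign_to_local:
        return name
    # ">" is the separator from the proposal, e.g. "en>Example".
    return f"{source_prefix}>{name}"


if __name__ == "__main__":
    print(imported_username("Example", "en", set()))
    print(imported_username("Example", "en", {"Example"}))
    print(imported_username("Example", "en", {"Example"}, assign_to_local=False))
```

With an empty local_users set the first call yields the prefixed form; the second keeps the local name because the account exists; the third prefixes it anyway because the importer opted out of local attribution.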
