Thanks Brian,

Thanks for the link to Php72ToUpper.php! I think I understand it now: for example, the first line says 'ƀ' => 'ƀ', which should mean that MW does not convert this letter to uppercase? That's one of the letters I found that wasn't converted to uppercase and that was generating a false positive in my code, so it's because specific MW code is preventing the conversion :-)
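To check my understanding, here is a minimal Python sketch of the behaviour as I read it. It is not MediaWiki's actual code. The function name `ucfirst_title` and the `OVERRIDES` table are hypothetical, the table holds only the one entry quoted above, and Python's `str.upper()` stands in for a recent Unicode uppercase mapping.

```python
# Hypothetical override table, modelled on the single entry quoted above.
# The real Php72ToUpper.php contains more entries; they are not reproduced here.
OVERRIDES = {"ƀ": "ƀ"}


def ucfirst_title(title, overrides=OVERRIDES):
    """Uppercase the first character of a title, unless an override applies.

    Assumption: an entry mapping a character to itself means
    "leave this character alone", as with 'ƀ' => 'ƀ'.
    """
    if not title:
        return title
    first, rest = title[0], title[1:]
    if first in overrides:
        # The override wins over the normal Unicode mapping.
        return overrides[first] + rest
    # Otherwise fall back to the runtime's Unicode uppercase mapping.
    return first.upper() + rest


if __name__ == "__main__":
    print(ucfirst_title("ƀar"))   # override applies: first letter unchanged
    print(ucfirst_title("abc"))   # ordinary letter: first letter uppercased
```

If this reading is right, a tool like WPCleaner would need the same table to predict whether two titles collide, since a plain Unicode uppercase would disagree with the wiki for every character listed in the overrides.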
Nico

On Sun, Aug 4, 2019 at 1:32 AM bawolff <bawolff...@gmail.com> wrote:

> MediaWiki uses php's mb_strtoupper.
>
> I believe this will use normal unicode uppercase algorithm. However this
> can vary depending on version of unicode. We are currently in the process
> of switching to php7, but for the moment we are still using HHVM's
> uppercasing code. There's a list of differences between hhvm and php7.2
> uppercasing at
> https://github.com/wikimedia/operations-mediawiki-config/blob/master/wmf-config/Php72ToUpper.php
> [All this is probably subject to change]
>
> However, I am at a loss as to why hhvm & php < 5.6 [1] wouldn't map that
> character, since the ɽ -> Ɽ mapping has been present since unicode 5
> (2006). Guess it was using a really old unicode data or something.
>
> See also bug T219279 [2]
>
> --
> Brian
>
> [1] https://3v4l.org/GHt3b
> [2] https://phabricator.wikimedia.org/T219279
>
> On Sat, Aug 3, 2019 at 7:57 AM Nicolas Vervelle <nverve...@gmail.com>
> wrote:
>
> > Hello,
> >
> > On most wikis, MediaWiki is configured to convert the first letter of a
> > title to uppercase, but apparently it's not converting every Unicode
> > character: for example, on frwiki ɽ
> > <https://fr.wikipedia.org/w/index.php?title=%C9%BD&redirect=no> is a
> > different article than Ɽ <https://fr.wikipedia.org/wiki/%E2%B1%A4>, even
> > if the second character is the uppercase version of the first one in
> > Unicode.
> >
> > So, what characters are actually converted to uppercase by the title
> > normalization?
> >
> > I need to know this information to stop reporting some false positives in
> > WPCleaner <https://fr.wikipedia.org/wiki/Wikip%C3%A9dia:WPCleaner>.
> >
> > Thanks, Nico
_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l