Addshore added a comment.

For reference: Langlink-entries not matching the page title, from the main namespace of de, en, and fr Wiktionary.

F4513480: wiktionary-langlink-mismatch.zip

The following library seems to be very useful here:
https://github.com/Behat/Transliterator
Running the two calls below over the lists linked above results in 2224 of the 3238 entries being normalized:

		$string = Behat\Transliterator\Transliterator::transliterate( $string );
		$string = str_replace( '-', '', $string );

The methods in core just don't seem to cut it.
The list of cases that are still missed can be seen at P4120.
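
For anyone who wants to reproduce the kind of count quoted above without a PHP setup, here is a minimal sketch of the comparison in Python. It is not the Behat library: it swaps in Unicode NFKD decomposition plus removal of combining marks as a rough stand-in for the transliteration step, so its numbers will not match 2224/3238. The function names and the sample pairs are hypothetical and are not taken from the attached lists.

```python
import unicodedata


def normalize(title: str) -> str:
    """Rough stand-in for the transliterate-then-strip-hyphens step."""
    # Decompose accented characters, then drop the combining marks.
    # ASSUMPTION: this only approximates a real transliteration library.
    decomposed = unicodedata.normalize("NFKD", title)
    stripped = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Mirror the str_replace( '-', '', $string ) call from the PHP snippet.
    return stripped.replace("-", "")


def count_normalized(pairs):
    """Count (langlink, page_title) pairs that agree after normalization."""
    return sum(1 for link, title in pairs if normalize(link) == normalize(title))


# Hypothetical sample pairs, for illustration only.
sample = [
    ("café", "cafe"),       # accent is stripped, so the pair matches
    ("e-mail", "email"),    # hyphen is removed, so the pair matches
    ("Straße", "Strasse"),  # NFKD leaves "ß" untouched, so this one is missed
]
print(f"{count_normalized(sample)}/{len(sample)} normalized")
```

The last pair shows why a simple decomposition-based approach can fall short: characters without a canonical decomposition need an explicit transliteration table, which is what a dedicated library provides.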


TASK DETAIL
https://phabricator.wikimedia.org/T145412

To: Addshore
Cc: gerritbot, Darkdadaah, WMDE-leszek, Lydia_Pintscher, gabriel-wmde, JAnD, daniel, Addshore, Aklapper, Lewizho99, Maathavan, D3r1ck01, Izno, Wikidata-bugs, aude, Mbch331