Addshore added a comment.
In T145412#2659357, @daniel wrote:
For reference: Langlink entries not matching the page title, from the main namespace of de, en, and fr Wiktionary.
The following library seems to be very useful here:
https://github.com/Behat/Transliterator
Run over the lists linked above, the following normalizes 2224 of the 3238 entries:
$string = Behat\Transliterator\Transliterator::transliterate( $string );
$string = str_replace( '-', '', $string );
Methods in core just don't seem to cut it.
The list of cases it misses can be seen at P4120.
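As a rough illustration of how a count like "2224/3238 normalized" could be produced, here is a minimal sketch. It assumes that "normalized" means the langlink and the page title become equal once both go through the same normalizer. The helper names (countNormalizedMatches, coreOnlyNormalize) and the sample pairs are made up for this example. The stand-in normalizer deliberately uses only PHP built-ins, so it is not the Behat call from the comment.

```php
<?php
// Hypothetical helper: count how many [ langlink, page title ] pairs agree
// once both sides are passed through the same normalizer.
function countNormalizedMatches( array $pairs, callable $normalize ): int {
	$matches = 0;
	foreach ( $pairs as [ $langlink, $title ] ) {
		if ( $normalize( $langlink ) === $normalize( $title ) ) {
			$matches++;
		}
	}
	return $matches;
}

// Stand-in normalizer built from PHP built-ins only:
// ASCII lowercasing plus hyphen removal. Accented characters are left untouched.
function coreOnlyNormalize( string $s ): string {
	return str_replace( '-', '', strtolower( $s ) );
}

// Made-up sample pairs, not taken from the real lists.
$pairs = [
	[ 'Foo-Bar', 'foobar' ],
	[ 'Café', 'cafe' ],
];

$matched = countNormalizedMatches( $pairs, 'coreOnlyNormalize' );
echo $matched . '/' . count( $pairs ) . " normalized\n";
```

The accented pair does not match with this stand-in, which is the kind of case a transliteration step is meant to catch. To try the approach from the comment instead, one could pass a closure that wraps the two statements quoted above, assuming the Behat library is installed (for example via Composer).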
To: Addshore
Cc: gerritbot, Darkdadaah, WMDE-leszek, Lydia_Pintscher, gabriel-wmde, JAnD, daniel, Addshore, Aklapper, Lewizho99, Maathavan, D3r1ck01, Izno, Wikidata-bugs, aude, Mbch331
_______________________________________________ Wikidata-bugs mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
