gabriel-wmde added a comment.
After the comments from @JAnD and @Darkdadaah, I have many questions about how much the page titles differ between languages. As far as I understand, some Wiktionary projects use their own localized spelling, which deviates from the "standard" Wiktionary spelling in order to adhere to the spelling standards of the project's language.

Some questions:

- Do Wiktionary words exist in more than two variations across all translations? If they do, how large is the variation?
- Does a Wiktionary project apply the localized spelling only to its own language (e.g. the French Wiktionnaire uses French apostrophes in French words and phrases, but leaves the apostrophes in other languages' words and phrases as-is)?
- Is the localized spelling consistent within one Wiktionary?
- For roughly what percentage of localized pages does a redirect from the standard spelling to the localized spelling exist?
- Would it be possible to normalize page titles algorithmically (search and replace with regular expressions)?
- Do words in different scripts (Cyrillic, Hebrew, Arabic, etc.) avoid these issues, or do they have even graver ones?

Is there someone who can reliably answer these questions for all Wiktionaries? Otherwise, the next step in this issue would be to write a program that analyzes the deviations.

TASK DETAIL
https://phabricator.wikimedia.org/T987
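To make the normalization and analysis questions concrete, here is a minimal sketch of what such a program could start from. It is not an existing tool. The function names `normalize_title` and `group_variants` are hypothetical, and the set of apostrophe variants it folds is an assumption; the real set would have to come out of the analysis itself. Some of these characters are proper letters in certain languages, so a blanket replacement may well be wrong for them.

```python
import re
import unicodedata

# Assumption: typographic characters that some projects may use in place of
# the ASCII apostrophe. This list is illustrative, not verified per wiki.
APOSTROPHE_VARIANTS = re.compile("[\u2018\u2019\u02BC]")


def normalize_title(title: str) -> str:
    """Apply Unicode NFC and fold apostrophe variants to the ASCII apostrophe."""
    title = unicodedata.normalize("NFC", title)
    return APOSTROPHE_VARIANTS.sub("'", title)


def group_variants(titles):
    """Group titles that collapse to the same normalized form.

    Returns only those normalized forms that have more than one
    distinct original spelling, i.e. the actual deviations.
    """
    groups = {}
    for title in titles:
        groups.setdefault(normalize_title(title), set()).add(title)
    return {norm: spellings for norm, spellings in groups.items()
            if len(spellings) > 1}
```

Fed with page titles collected from several Wiktionaries, `group_variants` would list each normalized title together with the spellings that map to it, which gives a first estimate of how many variations exist and how far they deviate.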
