Lucas_Werkmeister_WMDE added a comment.
> Let me see if there’s a more restrictive Unicode category we can use.

Not really – ZWJ/ZWNJ are in Other, format (Cf) <https://www.fileformat.info/info/unicode/category/Cf/list.htm> together with the directional control characters (U+202E RIGHT-TO-LEFT OVERRIDE and friends), which I don’t think we want to allow in decoded form.

MediaWiki core’s MediaWikiTitleCodec::splitTitleString() <https://gerrit.wikimedia.org/g/mediawiki/core/+/3cc288eac4/includes/title/MediaWikiTitleCodec.php#369> hard-codes the bidi characters as forbidden: U+200E-F and U+202A-E. I guess we could do the same, and re-encode those seven while allowing the rest of the `Cf` category? (But still blocking the other “other” categories: `Cc` Other, control; `Cs` Other, surrogate; `Co` Other, private use; and `Cn` Other, not assigned.)

(MediaWiki //allows// the bidi //isolate// characters in titles, and indeed U+2066 <https://en.wikipedia.org/wiki/> is a working redirect on enwiki. I’m not sure how I feel about that tbh.)

TASK DETAIL
https://phabricator.wikimedia.org/T327514
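
For illustration only, the rule proposed in the comment above (re-encode the seven hard-coded bidi characters, allow the rest of `Cf`, keep blocking `Cc`, `Cs`, `Co` and `Cn`) could be sketched roughly as follows. This is a minimal Python sketch, not the actual Wikibase or MediaWiki code; the names `FORBIDDEN_BIDI`, `BLOCKED_CATEGORIES`, `may_stay_decoded` and `reencode_unsafe` are hypothetical, and it only looks at the “Other” categories, ignoring every other reason a character might need percent-encoding.

```python
import unicodedata
from urllib.parse import quote

# The seven bidi characters named in the comment: U+200E-F and U+202A-E.
FORBIDDEN_BIDI = {0x200E, 0x200F, *range(0x202A, 0x202F)}

# "Other" categories that stay blocked. Cf is deliberately absent, so
# format characters such as ZWJ/ZWNJ are allowed unless listed above.
BLOCKED_CATEGORIES = {"Cc", "Cs", "Co", "Cn"}


def may_stay_decoded(ch: str) -> bool:
    """Return True if a single character may be left in decoded form."""
    if ord(ch) in FORBIDDEN_BIDI:
        return False
    return unicodedata.category(ch) not in BLOCKED_CATEGORIES


def reencode_unsafe(text: str) -> str:
    """Percent-encode (as UTF-8) every character that may not stay decoded.

    Assumption: the input contains no lone surrogates, which cannot be
    encoded as UTF-8 and would make quote() raise.
    """
    return "".join(
        ch if may_stay_decoded(ch) else quote(ch, safe="") for ch in text
    )
```

Under this sketch ZWJ (U+200D) and the isolate U+2066 stay decoded, matching what MediaWiki titles currently allow, while U+202E is re-encoded.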
