Hey, the way non-Latin characters are displayed in section IDs has always been a serious complaint from our communities: https://phabricator.wikimedia.org/T152540
Community Tech has done some work in this area, and it's ready to get more eyeballs: https://gerrit.wikimedia.org/r/#/c/362326/

A few words about the implementation plan:

* There is now a concept of primary vs. fallback IDs. Primary IDs are used for linking; fallback IDs exist so that old links still work.
* To transition to the new system, a wiki should first continue serving legacy-encoded section IDs as primary, with the new encoding as a fallback, then switch the two once all older parser/HTTP cache entries have been replaced with the new HTML. Legacy encoding should remain enabled as long as there is noticeable traffic using it; on WMF sites that probably means years.
* By default, MediaWiki will still behave exactly as before. Changing the defaults to something more modern will be discussed later, after all the initial issues are resolved.
* Because its output is used without escaping in so many places outside of core, and because there is now a fine distinction between ID escaping for different purposes, Sanitizer::escapeId() is deprecated. It will never output the new encoding and should be replaced with one of escapeIdForHtml(), escapeIdForLink() or escapeIdForExternalInterwiki() AFTER making sure the result is getting properly escaped.

Your help reviewing/testing/discussing this is highly appreciated!

--
Best regards,
Max Semenik ([[User:MaxSem]])
_______________________________________________
Wikitech-l mailing list
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
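
For anyone who wants a concrete picture of the primary vs. fallback idea, here is a rough Python sketch. It is not MediaWiki code: the function names and the "legacy"/"new" mode labels are made up for illustration, and both encoders are simplified approximations (the legacy one assumes the old scheme is roughly "percent-encode the UTF-8 bytes, then write '.' instead of '%'").

```python
from urllib.parse import quote


def legacy_id(heading: str) -> str:
    """Approximation of the legacy encoding (hypothetical helper).

    Spaces become underscores, the rest is percent-encoded as UTF-8,
    and '%' is then written as '.'.
    """
    encoded = quote(heading.replace(" ", "_"), safe="")
    return encoded.replace("%3A", ":").replace("%", ".")


def new_id(heading: str) -> str:
    """Approximation of the new encoding (hypothetical helper).

    Unicode characters are kept as-is; only spaces are replaced.
    """
    return heading.replace(" ", "_")


# Mode names are illustrative labels, not real configuration values.
ENCODERS = {"legacy": legacy_id, "new": new_id}


def section_ids(heading: str, modes: list[str]) -> tuple[str, list[str]]:
    """Return (primary, fallbacks) for a heading.

    The first mode yields the primary ID used when generating links.
    The remaining modes yield fallback IDs that keep old links working.
    Fallbacks identical to the primary (e.g. pure-ASCII headings) are dropped.
    """
    primary, *rest = [ENCODERS[mode](heading) for mode in modes]
    fallbacks = []
    for candidate in rest:
        if candidate != primary and candidate not in fallbacks:
            fallbacks.append(candidate)
    return primary, fallbacks


# Transition phase 1: legacy stays primary, new encoding is the fallback.
print(section_ids("Привет", ["legacy", "new"]))
# Transition phase 2 (after caches have turned over): swap the two.
print(section_ids("Привет", ["new", "legacy"]))
```

In this sketch the two-step transition from the list above is just a matter of reordering the mode list: the heading always carries both IDs, and only the one used for newly generated links changes.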
