User "Bawolff" posted a comment on MediaWiki.r80443.

Full URL: http://www.mediawiki.org/wiki/Special:Code/MediaWiki/80443#c18710

Commit summary:
* Introduced a non-dummy collation for $wgCategoryCollation, namely UCA with default tables.
* Added a maintenance script which generates a list of first letters. Unified Han are omitted for performance, and because they shouldn't be used as headings anyway. A future collation specific to Chinese would provide the KangXi radicals as "first letters".
* Provided a precomputed list of first letters. Used Unicode 6.0.0 data and ICU 4.2.
* Moved collation functionality from Language to a Collation class hierarchy with factory function. Removed the recently-added methods from Language and updated all callers.
* Changed Title::getCategorySortkey() to separate its parts with a line break instead of a null character. All collations supported by the intl extension ignore the null character, i.e. "ab" == "a\0b". It would have required a lot of hacking to make it work.
* Fixed the uppercase collation to handle non-ASCII characters, redundantly with r80436. I don't think it's necessary to change the collation name as was done there, so I reverted that in the course of my conflict merge. A --force option to updateCollation.php might be nice though.

Comment:

I just happened to be looking at the Collation code today, and I was wondering: is there a reason the Uppercase collation uses an EN language object instead of whatever the content language is? This would make a difference on (for example) Turkish wikis due to the whole i <-> İ thing.
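
To make the i <-> İ point concrete, here is a minimal Python sketch of how an uppercase sort key can differ depending on which language's case rules are applied. It is not the MediaWiki code: the function names uppercase_sortkey and first_letter and the TURKIC_LANGS set are hypothetical, and the Turkish rule is hard-coded for illustration rather than taken from any Language class.

```python
# Hypothetical helpers for illustration only; not MediaWiki's Collation API.

# Languages assumed here to use the dotted/dotless i distinction.
TURKIC_LANGS = {"tr", "az"}


def uppercase_sortkey(text: str, lang: str = "en") -> str:
    """Return an uppercase sort key using the case rules of `lang`."""
    if lang in TURKIC_LANGS:
        # In Turkish, lowercase dotted "i" uppercases to dotted "İ" (U+0130).
        # Dotless "ı" (U+0131) already uppercases to plain "I" under the
        # default Unicode mapping, so only "i" needs special handling.
        text = text.replace("i", "\u0130")
    return text.upper()


def first_letter(text: str, lang: str = "en") -> str:
    """Return the heading letter a category page would file `text` under."""
    return uppercase_sortkey(text, lang)[:1]


if __name__ == "__main__":
    # English rules file "istanbul" under "I"; Turkish rules file it under "İ".
    print(first_letter("istanbul", "en"))
    print(first_letter("istanbul", "tr"))
```

With English rules, pages starting with "i" and pages starting with "ı" both end up under "I", which is the difference a Turkish wiki would notice.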
