https://bugzilla.wikimedia.org/show_bug.cgi?id=30675
Santhosh Thottingal <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #1 from Santhosh Thottingal <[email protected]> 2011-09-01 05:27:35 UTC ---
More info:

Unicode defines the Default Unicode Collation Element Table (DUCET) and asks vendors and application developers to "tailor" it to meet their exact requirements. See
http://www.unicode.org/reports/tr10/tr10-23.html#Tailoring

One such tailoring is the CLDR collation data, which is apparently more accurate than DUCET for many languages.

In MediaWiki, User:Simetrical recently added DUCET-based collation data generation. See
http://www.mediawiki.org/wiki/User:Simetrical/Collation

However, that code uses plain DUCET and does not use the CLDR-tailored DUCET. See maintenance/language/generateCollationData.php. It uses
http://www.unicode.org/Public/UCA/latest/allkeys.txt

Unicode provides an alternate version of allkeys.txt named allkeys_CLDR.txt. See
http://unicode.org/Public/UCA/6.0.0/CollationAuxiliary.html

The variations depend on the language. We will need to look closely at this data set to see how it differs from DUCET and whether it can make collation more accurate.
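
As a starting point for comparing the two tables, here is a rough Python sketch (not part of MediaWiki). It assumes both files use the allkeys.txt layout of "code points ; [.pppp.ssss.tttt...]" entries, with "#" comments and "@" directive lines. The names parse_allkeys, sort_key and compare_order are made up for illustration. Raw weights are not directly comparable between the two files, so the sketch compares the relative order of the entries they share. It is not a full UCA implementation: it ignores variable weighting, implicit weights and contraction matching.

```python
import re

# One collation element, e.g. "[.15EF.0020.0008.0041]" or "[*0209.0020.0002.0020]".
# A leading "*" marks a variable element, "." a regular one.
ELEMENT_RE = re.compile(r"\[([.*])([0-9A-Fa-f.]+)\]")


def parse_allkeys(lines):
    """Map code point tuples to tuples of (is_variable, weights)."""
    table = {}
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop trailing comment
        if not line or line.startswith("@"):  # skip blanks and directives
            continue
        chars, _, elements = line.partition(";")
        key = tuple(int(cp, 16) for cp in chars.split())
        parsed = tuple(
            (marker == "*", tuple(int(w, 16) for w in body.split(".")))
            for marker, body in ELEMENT_RE.findall(elements)
        )
        if key and parsed:
            table[key] = parsed
    return table


def sort_key(elements, levels=3):
    """Simplified sort key: non-zero weights per level, 0 as level separator."""
    key = []
    for level in range(levels):
        key.extend(w[level] for _, w in elements if w[level] != 0)
        key.append(0)
    return tuple(key)


def compare_order(base, tailored):
    """Return the shared entries as ordered by each table."""
    shared = base.keys() & tailored.keys()
    by_base = sorted(shared, key=lambda k: (sort_key(base[k]), k))
    by_tailored = sorted(shared, key=lambda k: (sort_key(tailored[k]), k))
    return by_base, by_tailored
```

To use it, download allkeys.txt and allkeys_CLDR.txt, pass each open file to parse_allkeys, and look at where the two lists returned by compare_order disagree.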
