https://bugzilla.wikimedia.org/show_bug.cgi?id=30675
--- Comment #4 from Philippe Verdy <[email protected]> 2011-09-12 01:37:15 UTC ---

The CLDR-modified DUCET basically changes only the relative order of primary weights. But yes, it includes some notable differences for things like currency symbols. In the CLDR version, the Rupee sign will no longer sort with Latin letters, meaning that it will no longer be decomposed and that its first primary weight will now be distinct from the primary weight given to the Latin letter R. This also means that its "first letter" will need to be different.

To implement the "first letter", you need to do it consistently with the collation order, so the Rupee sign will need to use the Rupee sign itself as its "first letter", instead of Latin small letter r.

You can infer the "first letter" from the DUCET by looking at the first collation element that has the same primary weight and the smallest weights for the next levels. But to get a fully ordered list, which is necessary to make such a determination, you first need to decide what to do with variable elements: should they all sort with primary weights, or as ignorables? This decision radically changes the ordered sequence of collation elements and which "first letter" you'll get (note that variable elements do not interleave in the DUCET, at least for the first primary weight when they are expansions, but this is not necessarily the case with locale-specific tailorings).

One example: U+0060 (the ASCII "GRAVE ACCENT") has a possible tailored decomposition as SPACE + COMBINING GRAVE ACCENT, in which case it would sort with SPACE, with only a secondary difference for the accent (then, using an expansion). In that case, its "first letter" would become SPACE, and not itself.

There are more complex cases of "variable collation elements" that need special handling in tailorings, for "Modifier Letters", or for Hebrew and Tibetan "cantillation marks", or for Braille patterns.
For these cases, you must be extremely careful about how you compute the "first letter", or it will be completely out of sync with the collation order.
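
To make the rule above concrete, here is a minimal Python sketch of inferring the "first letter" from a collation table. It is only an illustration under stated assumptions, not MediaWiki's or ICU's implementation: the three tables are tiny and hypothetical, all weights are made up (they are not real DUCET or CLDR values), U+20A8 is assumed to be the Rupee sign meant here, and the names CE, sort_key and first_letter are invented for this example. The "shifted" flag stands for treating variable elements as ignorables.

```python
from typing import Dict, List, NamedTuple, Optional


class CE(NamedTuple):
    """One collation element; all weights in this sketch are made up."""
    primary: int
    secondary: int
    tertiary: int
    variable: bool = False


Table = Dict[str, List[CE]]


def sort_key(ces: List[CE], shifted: bool) -> tuple:
    """Build a simplified three-level sort key.

    With shifted=True, variable elements are treated as ignorable, and so
    are the primary-ignorable elements that directly follow them.
    """
    kept = []
    after_variable = False
    for ce in ces:
        if shifted and ce.variable:
            after_variable = True
            continue
        if shifted and after_variable and ce.primary == 0:
            continue
        after_variable = False
        kept.append(ce)
    primaries = tuple(ce.primary for ce in kept if ce.primary)
    secondaries = tuple(ce.secondary for ce in kept if ce.secondary)
    tertiaries = tuple(ce.tertiary for ce in kept if ce.tertiary)
    return (primaries, secondaries, tertiaries)


def first_letter(ch: str, table: Table, shifted: bool = False) -> Optional[str]:
    """Return the entry sharing ch's first primary weight that sorts lowest."""
    keys = {c: sort_key(ces, shifted) for c, ces in table.items()}
    primaries = keys[ch][0]
    if not primaries:
        return None  # ignorable at the primary level: no heading of its own
    lead = primaries[0]
    candidates = [c for c, k in keys.items() if k[0] and k[0][0] == lead]
    return min(candidates, key=lambda c: keys[c])


RUPEE = "\u20a8"  # assumption: the Rupee sign discussed above

# Hypothetical DUCET-like table: the Rupee sign expands to R + s.
DUCET_LIKE: Table = {
    " ": [CE(0x0209, 0x20, 0x02, True)],
    "`": [CE(0x0482, 0x20, 0x02, True)],
    "r": [CE(0x2000, 0x20, 0x02)],
    "R": [CE(0x2000, 0x20, 0x08)],
    "s": [CE(0x2100, 0x20, 0x02)],
    RUPEE: [CE(0x2000, 0x20, 0x04), CE(0x2100, 0x20, 0x04)],
}

# Hypothetical CLDR-like table: the Rupee sign gets its own primary weight.
CLDR_LIKE: Table = dict(DUCET_LIKE)
CLDR_LIKE[RUPEE] = [CE(0x1500, 0x20, 0x02)]

# Hypothetical tailoring: U+0060 expands to SPACE + COMBINING GRAVE ACCENT.
TAILORED: Table = dict(DUCET_LIKE)
TAILORED["`"] = [CE(0x0209, 0x20, 0x02, True), CE(0, 0x25, 0x02)]
```

With these toy tables, first_letter(RUPEE, DUCET_LIKE) gives "r" while first_letter(RUPEE, CLDR_LIKE) gives the Rupee sign itself. For U+0060, the tailored table gives SPACE when variable elements keep their primary weights, and None (no heading at all) when they are treated as ignorables, which is why the handling of variable elements has to be decided first.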
