https://bugzilla.wikimedia.org/show_bug.cgi?id=43740

Bawolff (Brian Wolff) <bawolff...@gmail.com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
           Keywords|                            |patch-in-gerrit
            Summary|IcuCollation doesn't prune  |IcuCollation doesn't prune
                   |first letter elements that  |first letter elements that
                   |duplicate a prefix of       |duplicate a prefix of
                   |another first letter'       |another first letter's
                   |                            |sortkey

--- Comment #8 from Bawolff (Brian Wolff) <bawolff...@gmail.com> ---
OK, I read up on ICU. After quite a bit of googling, this actually looks not
that complicated. From what I gather (if I read the docs right, which is a very
big if), there should be no two primary collation elements where one collation
element in its entirety is a prefix of some other collation element. See
https://ssl.icu-project.org/repos/icu/icuhtml/trunk/design/collation/ICU_collation_design.htm
specifically:
 R2. A fractional weight cannot exactly match the initial bytes of another
fractional weight at the same level.

So assuming nothing funky is done to compress the sort keys (which I don't
think happens on the primary level, at least not currently), just looking for
matching prefixes should work.
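
A minimal sketch of that prefix check, in Python rather than the PHP of the
actual patch, and not taken from gerrit change 55503. The function name, the
labels, and the byte strings are all hypothetical (they are not real ICU
output); the "+" entries stand for letters whose primary sort key begins with
another letter's complete key. Which side of a prefix pair gets dropped is also
an assumption here: the sketch keeps the shorter key and drops the longer one.

```python
def prune_prefixed_keys(entries):
    """Drop entries whose sort key starts with an earlier kept key.

    entries: iterable of (label, key_bytes) pairs (hypothetical data).
    Exact duplicates are left alone; a separate duplicate check is
    assumed to handle them.
    """
    kept = []
    prev = None
    # Bytewise sorting puts every key that extends P directly after P.
    for label, key in sorted(entries, key=lambda e: e[1]):
        is_extension = (
            prev is not None
            and prev != b""
            and key != prev
            and key.startswith(prev)
        )
        if is_extension:
            # Do not update prev, so later extensions of the same
            # kept key are dropped as well.
            continue
        kept.append((label, key))
        prev = key
    return kept


# Hypothetical primary sort keys, for illustration only.
entries = [
    ("A", b"\x29"),
    ("A+", b"\x29\x29"),
    ("B", b"\x2b"),
    ("R", b"\x4b"),
    ("R+", b"\x4b\x4d"),
    ("S", b"\x4d"),
]

kept_labels = [label for label, _ in prune_prefixed_keys(entries)]
```

With this input, the "A+" and "R+" entries are dropped and the other four
remain. The approach only holds if rule R2 above applies to the keys being
compared and the keys are not compressed.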

Anyhow, see gerrit change 55503. I still need to double-check that unsetting an
element of an array doesn't modify its sorted order (it doesn't seem to, but I
should double-check).
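
The property in question can be illustrated with a rough Python analogue. This
is not a test of PHP's unset(); it only shows the behaviour being relied on,
using an insertion-ordered dict in place of a PHP array. The map contents and
the helper name are hypothetical.

```python
def remove_key(mapping, key):
    """Return a copy of mapping without key, leaving the rest untouched."""
    remaining = dict(mapping)
    del remaining[key]
    return remaining


# Hypothetical sort key -> label map, inserted in sorted key order.
letter_map = {b"\x29": "A", b"\x29\x29": "A+", b"\x2b": "B"}

after = remove_key(letter_map, b"\x29\x29")

# Removing an entry leaves the remaining keys in their original order,
# so a map that was sorted before the removal is still sorted after it.
still_sorted = list(after) == sorted(after)
```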

It also might be prettier if the check for duplicate prefixes was merged into
the general duplicate check, but I didn't see an easy way of doing that.

Anyhow, feedback appreciated.

-- 
You are receiving this mail because:
You are on the CC list for the bug.
You are the assignee for the bug.
You are watching all bug changes.
_______________________________________________
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l