On Tue, May 12, 2009 at 4:38 PM, Brion Vibber <[email protected]> wrote:
> * Collation use for sorting needs to be double-checked to confirm it
> wouldn't interfere with present uniqueness constraints

Since cl_sortkey isn't part of any unique key, this appears not to be
an issue for this use.  Of course, it's an issue for every other
sorted list of titles, but those can't have custom sort keys specified
to begin with and don't seem to be included in this proposal.  Perhaps
they should be, though.  In that case we'd probably end up needing an
extra column in every single table that includes the page title, just
for sorting (but we'd be able to use flexible algorithms to generate
the sort key, rather than being stuck with MySQL's).
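
To make the extra-column idea concrete, here is a rough sketch, not a proposal for the actual schema. The table page_demo, its sortkey column, and make_sort_key() are all hypothetical names, and SQLite stands in for MySQL only so the example is self-contained. The point is that the unique key stays on the raw title while ordering comes from a separate, application-generated key that compares correctly as plain bytes.

```python
import sqlite3
import unicodedata


def make_sort_key(title: str) -> bytes:
    """Hypothetical sort key: accent-stripped, case-folded title, then the raw title.

    This is a crude stand-in for a real collation algorithm; it only shows
    that the application, not the database, decides the order.
    """
    decomposed = unicodedata.normalize("NFKD", title)
    folded = "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()
    # The raw title after a NUL byte breaks ties deterministically.
    return (folded + "\x00" + title).encode("utf-8")


def sorted_titles(titles):
    """Store titles with a generated sort key and read them back in key order."""
    conn = sqlite3.connect(":memory:")
    # Uniqueness is enforced on the title only; the sort key is not part of it.
    conn.execute("CREATE TABLE page_demo (title TEXT PRIMARY KEY, sortkey BLOB)")
    conn.execute("CREATE INDEX page_demo_sortkey ON page_demo (sortkey)")
    conn.executemany(
        "INSERT INTO page_demo (title, sortkey) VALUES (?, ?)",
        [(t, make_sort_key(t)) for t in titles],
    )
    # BLOB values compare byte by byte, so no database collation is involved.
    rows = conn.execute("SELECT title FROM page_demo ORDER BY sortkey").fetchall()
    conn.close()
    return [row[0] for row in rows]
```

Changing the sort algorithm then means regenerating the sortkey column, not touching the title or any unique index.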

> * Multilingual sites possibly not well served by table-wide
> language-specific coding

Sorting with a utf8 collation would be a lot better than binary sorting
for any site, I'm pretty sure.  (I assume the utf8 collations sort
sanely and not according to code point.)
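
For illustration, the difference between the two orderings can be sketched as follows. The folding key here is only a rough approximation of a case- and accent-insensitive collation; it is an assumption made for the example, not MySQL's actual algorithm, and the function names are made up.

```python
import unicodedata


def collation_like_key(title: str) -> str:
    """Approximate a case- and accent-insensitive collation key (illustrative only)."""
    decomposed = unicodedata.normalize("NFKD", title)
    return "".join(c for c in decomposed if not unicodedata.combining(c)).casefold()


def compare_orders(titles):
    """Return (binary_order, folded_order) for the same list of titles."""
    # Binary sorting compares the UTF-8 bytes, which matches code point order.
    binary_order = sorted(titles, key=lambda t: t.encode("utf-8"))
    # Folded sorting ignores case and accents; ties keep their input order.
    folded_order = sorted(titles, key=collation_like_key)
    return binary_order, folded_order
```

With binary sorting, every title starting with an uppercase ASCII letter sorts before every lowercase one, and accented letters land after all of ASCII, which is rarely what a reader of a category page expects.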

> Doing our own localized sort key encoding and adding another indexed
> column to sort on would avoid some dependency issues but has its own
> deployment and maintenance difficulties.

You don't need another column for categorylinks; you can use the
existing cl_sortkey, so that should be relatively easy to deploy.  It
doesn't help with non-category use cases, of course.

> It would also be possible to use a separate column for the collated
> sorting while using MySQL 4.1+'s native collations, if the uniqueness
> constraints are a problem, but this is still dependent on rolling out an
> upgrade from 4.0.

In that case we may as well make it like cl_sortkey and populate it
ourselves, surely.

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
