https://bugzilla.wikimedia.org/show_bug.cgi?id=8732

--- Comment #9 from Philippe Verdy <[email protected]> 2010-04-14 12:11:49 UTC ---
Anyway, the effort is not so dramatic (at least for Latin languages, and even
for Cyrillic or Greek).

It is however dramatic for Hebrew, Arabic, Chinese or Korean, where sort keys
are extremely difficult to create or infer correctly, and where they need to
be specified absolutely everywhere: the simple binary order of Unicode code
point values means absolutely nothing for these scripts. This should be
automated as much as possible.

A built-in parser function for computing collation keys is only a first step,
but this effort will pay off. In the MediaWiki software, this means
integrating the open-source ICU library and its now standardized interface
layer for PHP.
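
To illustrate what such a parser function would return, here is a minimal
sketch in Python. It is NOT ICU and not the Unicode Collation Algorithm: the
function name toy_sort_key and its rule (strip combining marks, fold case, keep
the original title as a tie-breaker) are assumptions made only for this
example. It just shows that a computed key can give a usable order where raw
code point order does not.

```python
import unicodedata


def toy_sort_key(title: str) -> bytes:
    """Hypothetical stand-in for a real collation key (ICU would do far more)."""
    # Decompose so that accents become separate combining characters.
    decomposed = unicodedata.normalize("NFD", title)
    # Primary level: base letters only, case-folded.
    primary = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    ).casefold()
    # Tie-breaker: the original title, so distinct titles get distinct keys.
    return primary.encode("utf-8") + b"\x00" + title.encode("utf-8")


titles = ["zebra", "Éclair", "apple", "eclipse"]

# Plain code point order: "Éclair" lands after "zebra".
by_code_point = sorted(titles)

# Order by computed key: "Éclair" is filed with the other "e" words.
by_sort_key = sorted(titles, key=toy_sort_key)

print(by_code_point)
print(by_sort_key)
```

The keys are plain byte strings, so anything that can compare bytes can sort
them without knowing the rules that produced them.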

Other solutions based on the underlying SQL engine will not work as
universally. In addition, they cause severe maintenance problems for the SQL
engine. Let's keep the binary sort order in SQL and instead allow the
preparation of separate indexes for the same categories. This will be
SQL-agnostic and will work with the various SQL engines that can already be
used with MediaWiki, not just MySQL, whose Unicode support is minimalist and
really not portable.
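
A rough sketch of the "binary order in SQL, separate indexes per category"
idea, using Python with SQLite only because it is easy to run. The table and
column names (cat_sortkeys, locale, sortkey) and the two toy key functions are
assumptions for this example, not MediaWiki's schema and not real collation
rules. The point is that the keys are computed outside the database and the
engine only compares bytes.

```python
import sqlite3
import unicodedata


def key_generic(title: str) -> bytes:
    # Toy rule: ignore accents and case.
    decomposed = unicodedata.normalize("NFD", title)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    return base.casefold().encode("utf-8")


def key_swedish(title: str) -> bytes:
    # Toy rule: å, ä, ö are separate letters filed after z.
    # '{', '|', '}' are the ASCII characters right after 'z'.
    out = []
    for ch in unicodedata.normalize("NFC", title).casefold():
        if ch in "åäö":
            out.append("{|}"["åäö".index(ch)])
        else:
            out.append(key_generic(ch).decode("utf-8"))
    return "".join(out).encode("utf-8")


KEY_FUNCS = {"generic": key_generic, "sv": key_swedish}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE cat_sortkeys "
    "(category TEXT, locale TEXT, sortkey BLOB, title TEXT)"
)
# One index serves every locale; the engine compares the BLOBs byte by byte.
conn.execute(
    "CREATE INDEX cat_sortkeys_idx ON cat_sortkeys (category, locale, sortkey)"
)

for title in ["Zebra", "Ärta", "Apple", "Öl", "banana"]:
    for locale, make_key in KEY_FUNCS.items():
        conn.execute(
            "INSERT INTO cat_sortkeys VALUES (?, ?, ?, ?)",
            ("Food", locale, make_key(title), title),
        )


def listing(category: str, locale: str) -> list:
    rows = conn.execute(
        "SELECT title FROM cat_sortkeys "
        "WHERE category = ? AND locale = ? ORDER BY sortkey",
        (category, locale),
    )
    return [row[0] for row in rows]


print(listing("Food", "generic"))
print(listing("Food", "sv"))
```

The same category yields a different listing per locale, while the SQL side
uses nothing beyond an ordinary index and binary comparison.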

This situation will persist as long as there's no international and
vendor-neutral ISO standard for SQL engines, supported by all major SQL
engines with compatible collation data across them, that can also create
collation keys and orderings consistent with PHP server-side or maybe
client-side implementations (much later, if a similar standard is adopted in
ECMAScript).

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are the assignee for the bug.
You are on the CC list for the bug.

