https://bugzilla.wikimedia.org/show_bug.cgi?id=164
--- Comment #134 from Philippe Verdy <[email protected]> 2009-11-19 13:51:53 UTC ---

Collation keys can coexist with custom sort keys. Custom sort keys are useful and will continue to be useful, to tweak the default collation (which for now is simply a binary ordering of code points).

Today, basically, the categories are sorted by [sort key, full page name] (possibly truncated to a reasonable size using a unique identifier). What we want is to be able to sort by [collationkey([sort key, full page name]), sort key, full page name] (also with possible truncation of the whole). This will preserve the existing tweaks made in pages when they reference categories, and will help sort all the rest.

Ideally, the collation keys should be generated on the fly by the SQL engine itself, as part of its supported "ORDER BY" clause for getting the list of article names to display in categories. This would allow alternate sort orders, according to user locale preferences or according to web query parameters set by GUI buttons, especially for Chinese, where several sort orders are common: sort by Pinyin, sort by radical/strokes, sort by traditional dictionary orders.

But if the SQL engine does not have such support, this must be implemented in the PHP code, and the collation keys can be stored in a new data column. The extra data column can be added or filled conditionally: if the SQL engine supports the needed collations, this column can remain NULL to save storage space.

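The proposed ordering [collationkey([sort key, full page name]), sort key, full page name] can be illustrated with a small Python sketch. It is only an illustration, not MediaWiki code: `collation_key` is a hypothetical stand-in that strips accents and case, whereas a real implementation would use locale-tailored UCA/CLDR collation keys, and `category_order_key` is likewise an assumed helper name.

```python
import unicodedata


def collation_key(text):
    """Hypothetical stand-in for a real collation key (NOT the UCA):
    decompose, drop combining marks, and fold case."""
    decomposed = unicodedata.normalize("NFKD", text)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    return base.casefold()


def category_order_key(sort_key, page_name):
    """Collation keys first, then the raw sort key and page name as
    tie-breakers, so existing custom sort keys still take effect."""
    return (collation_key(sort_key), collation_key(page_name),
            sort_key, page_name)


# (sort key, full page name) pairs, as a category index might hold them.
members = [
    ("Zebra", "Zebra"),
    ("Étoile", "Étoile"),
    ("apple", "apple"),
    ("Eagle", "Eagle"),
    ("Apple", "Apple"),
]

# Current behaviour: plain binary ordering of code points.
binary_order = [name for _, name in sorted(members)]

# Proposed behaviour: collation key first, raw values as tie-breakers.
collated_order = [
    name
    for _, name in sorted(members, key=lambda m: category_order_key(*m))
]
```

With the binary ordering, all uppercase ASCII titles sort before lowercase ones and "Étoile" lands at the very end; with the compound key, "Étoile" sorts between "Eagle" and "Zebra", while "Apple" and "apple" (equal under the stand-in collation) are still separated deterministically by the raw sort key.
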
If the SQL engine does not support dynamic collations, then the alternate (user locale-based) sort orders will not be easy to implement, because of the cost they would incur on the SQL client side (in PHP) for heavily populated categories, which is exactly where true locale-based collation orders are most wanted. The alternative is for the database to store multiple collation keys (for distinct specific locales), but this can severely impact server performance: it would require an extra join to a separate 1:N table storing the collation keys indexed by (pageid, locale), instead of storing these keys in the same SQL table used for the category index.

Additionally, the stored collation keys will sometimes need to be updated (when the CLDR data for locale-tailored collations is updated, or when a new Unicode version adds characters). Updating a large volume of stored collation keys will require a lot of work, and this can impact the availability of the wiki project, unless the data model includes a versioning system that allows at least two versions for the same locale to coexist for some time, and then allows switching from one version to the next before the old collation keys are cleaned up.
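
The versioning idea (two versions of the keys for one locale coexisting, a switch, then a cleanup) can be sketched as follows. This is a minimal in-memory model under stated assumptions, not a schema proposal: the class `CollationKeyStore`, its method names, and the version labels "v1"/"v2" are all hypothetical, and the dictionary stands in for the separate 1:N table indexed by (pageid, locale), extended with a version.

```python
class CollationKeyStore:
    """Hypothetical in-memory model of versioned, per-locale collation keys."""

    def __init__(self):
        self._keys = {}    # (page_id, locale, version) -> collation key
        self._active = {}  # locale -> version currently used for sorting

    def put(self, page_id, locale, version, key):
        self._keys[(page_id, locale, version)] = key

    def activate(self, locale, version):
        """Switch readers of this locale to another version of the keys."""
        self._active[locale] = version

    def sorted_pages(self, locale):
        """Page ids ordered by the keys of the active version only."""
        version = self._active[locale]
        rows = [
            (key, page_id)
            for (page_id, loc, ver), key in self._keys.items()
            if loc == locale and ver == version
        ]
        return [page_id for _, page_id in sorted(rows)]

    def purge_inactive(self, locale):
        """Delete keys of every non-active version; return how many."""
        active = self._active[locale]
        stale = [k for k in self._keys if k[1] == locale and k[2] != active]
        for k in stale:
            del self._keys[k]
        return len(stale)


store = CollationKeyStore()

# Version "v1": keys equal to the raw titles (binary ordering).
for page_id, key in {1: "Zebra", 2: "Étoile", 3: "eagle"}.items():
    store.put(page_id, "fr", "v1", key)
store.activate("fr", "v1")
order_v1 = store.sorted_pages("fr")

# Version "v2" is filled in while "v1" keeps serving reads.
for page_id, key in {1: "zebra", 2: "etoile", 3: "eagle"}.items():
    store.put(page_id, "fr", "v2", key)
order_during_rebuild = store.sorted_pages("fr")

# Switch to the new version, then clean up the old keys.
store.activate("fr", "v2")
order_v2 = store.sorted_pages("fr")
purged = store.purge_inactive("fr")
```

The point of the design is that the expensive rebuild never blocks reads: the old version keeps answering queries until the new one is complete, and the switch itself is a single small update.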
