https://bugzilla.wikimedia.org/show_bug.cgi?id=164
--- Comment #154 from Philippe Verdy <[email protected]> 2009-11-25 23:06:01 UTC ---

That's another good reason why collation should be supported directly within the MediaWiki software. MediaWiki already depends completely on PHP, so it should use the best integration possible, through the dedicated ICU integration module for PHP. This also means that it will be much better to simply store the computed collation keys directly within the database schema, unless the database adapter for PHP already supports ICU (and there's a commitment from the database vendor to integrate ICU as part of its design). The Todo:ICU item in PostgreSQL will be fine once it is actually implemented and the integration is fully supported.

But anyway, the first thing to do is to press the PHP developers to commit to full support and integration of ICU within PHP. If this is still not the case, we need to make sure that the ICU integration module for PHP comes from a viable project (otherwise MediaWiki will need its own adaptation layer for collation, one that allows a transparent switch to another PHP integration module, or to a later integration within the PHP core itself).

The second thing to look for (and that is still missing) is support for an ICU-like project (or port) for Javascript (for integration on the client side, in the browser). Here too there should be an adapter layer written in Javascript, so that the Javascript-written collator can later be replaced by an API supported natively by browsers (because it will perform much better). The best integration tools for client-side collation that I have seen, using Javascript, depend fully on AJAX (i.e.
in collaboration with server-side scripts that can provide precomputed collation data, or that can compute the collation keys from the client-provided texts): some interesting demos use JSON or XML requests through AJAX, but this adds delays and increases the number of HTTP requests needed to sort lots of client-side data (for example when sorting the columns of rendered HTML tables, which currently just uses the Javascript "localeCompare" function, which seems to use only the DUCET or some locale-neutral collation, without taking the actual locale into account).

It would be much better if all major browser engines (for IE, Mozilla for Firefox, WebKit for Safari/Chrome/KHTML) decided to extend the very poor support of Unicode and locales within Javascript/ECMAScript strings, using ICU as a foundation, or at least as the basis for the services API that they implement (even if those browsers use different integration strategies). They should still support the same collation rules, with the same syntax in a similar language, such as the languages already documented in the Unicode standard, including the possibility to use the collation tailoring data already coming from the CLDR project, and the possibility for these implementations to still support user-specified tailorings (so without hardcoding them in a way that would completely depend on the implemented Unicode version and the limited list of locales already supported by CLDR).

There are two standard languages defined for collation tailorings: one is XML-based but extremely verbose (it is probably easier to use from a DOM-based object view, and most probably more efficient at run-time); the other, equivalent one is much more compact, more readable, and much easier for users to specify by hand or in scripts.
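
To make the difference between the two tailoring languages concrete, here is a minimal sketch in TypeScript. It assumes that the compact form is the ICU-style rule syntax (`&` for a reset, `<`, `<<`, `<<<` and `=` for primary, secondary, tertiary and identical differences) and that the XML form uses the LDML elements `<reset>`, `<p>`, `<s>`, `<t>` and `<i>`. The function names, the intermediate `Rule` structure and the sample rule are only illustrative, and quoting, escapes and contextual rules are not handled at all.

```typescript
// Strength operators of the compact syntax, mapped to XML element names.
const OPERATORS: Record<string, string> = {
  "<": "p",   // primary difference
  "<<": "s",  // secondary difference
  "<<<": "t", // tertiary difference
  "=": "i",   // identical
};

// Hypothetical intermediate structure, used only in this sketch.
interface Step { element: string; text: string; }
interface Rule { reset: string; steps: Step[]; }

// Parse a tiny subset of the compact syntax (no quoting, no escapes).
function parseCompact(rules: string): Rule[] {
  const parsed: Rule[] = [];
  for (const chunk of rules.split("&")) {
    if (chunk.trim() === "") continue;
    // The capturing group keeps the operators in the output; longest first.
    const parts = chunk.split(/(<<<|<<|<|=)/).map((p) => p.trim());
    const rule: Rule = { reset: parts[0], steps: [] };
    for (let i = 1; i + 1 < parts.length; i += 2) {
      rule.steps.push({ element: OPERATORS[parts[i]], text: parts[i + 1] });
    }
    parsed.push(rule);
  }
  return parsed;
}

function escapeXml(s: string): string {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Serialize the parsed rules to the verbose XML form.
function toXml(rules: Rule[]): string {
  const body = rules
    .map((r) =>
      "<reset>" + escapeXml(r.reset) + "</reset>" +
      r.steps
        .map((s) => "<" + s.element + ">" + escapeXml(s.text) + "</" + s.element + ">")
        .join(""))
    .join("");
  return "<rules>" + body + "</rules>";
}

// Illustrative rule: sort "ch" after "c", with "Ch" as a case variant of "ch".
console.log(toXml(parseCompact("&c < ch <<< Ch")));
// prints: <rules><reset>c</reset><p>ch</p><t>Ch</t></rules>
```

Even for this single rule, the compact form takes 14 characters where the XML form takes about 50, which gives an idea of the difference in verbosity.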
Both syntaxes are automatically and easily convertible into each other, with equivalent compilation times and complexity, but the compact form is easier to transmit in small scripts over HTTP (including through AJAX), and it is much faster to parse because it does not depend on a resource-hungry XML parser (XML parsing is costly because of its required and complex conformance rules). For Javascript clients, a JSON-based syntax for collation tailorings may be even more efficient, without requiring additional complex code written in Javascript itself.
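
The client-side adapter layer suggested in this comment could look roughly like the following TypeScript sketch. All names here (`Collator`, `NativeFactory`, `getCollator`, `sortColumn`) are hypothetical. The script-written fallback simply compares UTF-16 code units and stands in for a real script-written collator, and `Intl.Collator` is used only as one example of a natively supported API in engines that provide it.

```typescript
// Minimal contract that the rest of the page code depends on.
interface Collator {
  compare(a: string, b: string): number;
}

// Script-written fallback: plain UTF-16 code unit order, not locale-aware.
// A real port of a collation library would be plugged in here instead.
const codeUnitCollator: Collator = {
  compare: (a, b) => (a < b ? -1 : a > b ? 1 : 0),
};

// Hypothetical injection point for a native implementation.
type NativeFactory = (locale: string) => Collator;

// One possible native factory, for engines that provide Intl.Collator.
const intlFactory: NativeFactory = (locale) => new Intl.Collator(locale);

// Prefer the native collator when one is supplied and works for the locale,
// otherwise fall back transparently to the script-written one.
function getCollator(locale: string, native?: NativeFactory): Collator {
  if (native) {
    try {
      return native(locale);
    } catch (_err) {
      // Native support missing or locale rejected: use the fallback below.
    }
  }
  return codeUnitCollator;
}

// Sort the cell texts of one table column without modifying the input.
function sortColumn(cells: string[], collator: Collator): string[] {
  return cells.slice().sort((a, b) => collator.compare(a, b));
}
```

Calling code would use something like `sortColumn(cells, getCollator("sv", intlFactory))`, so switching to a different collator later only means passing a different factory.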
