https://bugzilla.wikimedia.org/show_bug.cgi?id=19190
Philippe Verdy <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #12 from Philippe Verdy <[email protected]> 2010-10-10 04:59:41 UTC ---
"In the case PLURAL: we probably have to modify the way that is calculated on
the server via having a array representation ie( array('1-4':X, '5':Y,
'6-11':Z) (instead of having a php function with switch statements)"

Please note that some messages contain (and do need) several occurrences of
PLURAL: with DISTINCT numeric values. Such messages cannot necessarily be
split into distinct resources (because of language grammars, or the meaning of
translated FULL sentences, which may require reordering some items).

If all that JavaScript has to do is parse PLURAL: items, with their $n
parameters, I think this is not complicated to implement, because such parsing
will be extremely simple, provided that there's a policy about its presence
and encoding in messages. So it will just look like this basic regexp, which
all JavaScript engines will handle correctly:

/\{\{PLURAL:\$([0-9]+)\|([^}]*)\}\}/

The only restriction is that the part between the pipe and the first closing
brace should not contain any wiki markup, nor characters such as newlines,
pipes or braces (however, these characters may be transmitted by the server as
numeric entities, if they are really present in the source wiki or PHP code).
Such a policy is enforceable in translatable resources sent to and received
from translatewiki.net, or through correct documentation of the messages to
translate.

Then the content of the second capture group (the text between the first pipe
and the first closing brace) can be split immediately on the pipe character
into a basic array.
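As a rough illustration of the parsing step described above, here is a minimal sketch that applies the regexp (with the global flag added, so that several PLURAL: occurrences in one message are handled) and splits the second capture group on the pipe character. The names expandPlurals and pluralIndex are hypothetical, and the two locale rules are only illustrative placeholders for whatever data the server would really supply; this is not MediaWiki's actual implementation.

```javascript
// Regexp from the comment, plus the "g" flag so every occurrence is replaced.
const PLURAL_RE = /\{\{PLURAL:\$([0-9]+)\|([^}]*)\}\}/g;

// Hypothetical rule table: each rule maps a number to an index in the
// list of forms. Only two illustrative locales are shown; real rules
// would have to be provided for every locale the wiki supports.
const pluralIndex = {
  en: (n) => (n === 1 ? 0 : 1),
  fr: (n) => (n === 0 || n === 1 ? 0 : 1),
};

// Expand every {{PLURAL:$n|form1|form2|...}} in a message.
// params[0] is the value of $1, params[1] the value of $2, and so on.
// Plain $n placeholders outside PLURAL: are left untouched.
function expandPlurals(message, params, locale) {
  // Assumption: unknown locales fall back to the English rule.
  const rule = pluralIndex[locale] || pluralIndex.en;
  return message.replace(PLURAL_RE, (match, paramNo, body) => {
    const n = Number(params[Number(paramNo) - 1]);
    const forms = body.split('|');
    // If fewer forms were supplied than the rule expects, use the last one.
    const index = Math.min(rule(n), forms.length - 1);
    return forms[index];
  });
}

const msg = '$1 {{PLURAL:$1|page|pages}} in $2 {{PLURAL:$2|category|categories}}';
console.log(expandPlurals(msg, [1, 3], 'en'));
// prints "$1 page in $2 categories"
```

Each PLURAL: occurrence is resolved against its own parameter, so a message with several DISTINCT numeric values does not need to be split into separate resources.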
The difficulty will be implementing the plural rules for each locale (which
values count as plural, and how many forms are needed: consider singular,
plural, dual, few, many, other...). How many locales does the JavaScript have
to support? Can these rules be encoded in a way that lets JavaScript handle
the plural rules correctly for all locales supported by the wiki?

The same could be used for GENDER. The wiki can also provide the appropriate
external data to the JavaScript, along with the message, as additional
properties, without forcing the JavaScript to perform another AJAX or JSON
query to the server each time it detects a message containing a GENDER or
PLURAL substitution function.

Another difficulty comes with plural forms that cause a change of grammatical
case (notably in Slavic languages), and which also depend on how the sentences
containing these conditional plural forms are constructed: other parts of the
sentence may need to change. But it is impossible to predict which parts will
be affected (notably if there are several GENDER or PLURAL occurrences in the
message). Should we consider GRAMMAR? Probably not for JavaScript.

Finally there's the problem of wikis that use:
- multiple scripts (including Chinese, for converting between simplified and
  traditional ideographs). This requires a complex script to correctly handle
  the dynamic message formatting (or character substitutions).
- RTL scripts (Hebrew, Arabic, ...), because they are in fact using a mix of
  scripts.
The correct rendering of formatted messages often requires specific BiDi
control for embedding some variable items. This is really complex in the
presence of BiDi-neutral or weak characters, notably common final punctuation.
For example, in an RTL wiki, a message that starts or ends with Latin letters
(possibly in the variable part of the sentence) will cause the characters
without strong directionality to be displayed in the wrong place, or the whole
sentence may appear broken or reordered, creating confusion.

Currently, MediaWiki does not handle BiDi gracefully: it offers no easy way to
support correct BiDi embedding of variable elements in the middle of a
sentence, and no easy way to restore the default directionality after the
variable part. Unicode offers BiDi controls, but they are NOT recommended in
HTML, which should use <element dir=""> overrides or CSS bidi properties.

The solution seems simple but it is not: before and after the variable parts
of the message in the same HTML block element, there need to be <span dir="">
elements to embed the static parts, but most resources are not prepared this
way. This has to be done on Translatewiki.net when translating resources
containing variable positions whose content directionality is ambiguous,
variable or unknown (for example user names, page names, native foreign
language names from {{language:}}, or resources autotranslated via {{int:}}).
This in fact affects all wikis, including English ones, not just wikis with a
default RTL locale.
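To make the BiDi embedding idea more concrete, here is a small sketch that substitutes $n parameters and isolates each one in its own <span dir="...">. Note that it wraps the variable parts rather than the static parts mentioned above; that is a substitution made here for brevity, not the approach the comment describes. The function names are hypothetical, the direction guess is a crude first-strong-character test over a few Unicode blocks (an assumption, far simpler than the full Unicode Bidirectional Algorithm), and the static message text is assumed to be already safe HTML.

```javascript
// Crude character classes (assumption): Hebrew/Arabic-family blocks count
// as RTL, and basic Latin, Greek and Cyrillic letters count as LTR.
const RTL_CHAR = /[\u0590-\u08FF\uFB1D-\uFDFF\uFE70-\uFEFF]/;
const LTR_CHAR = /[A-Za-z\u00C0-\u024F\u0370-\u03FF\u0400-\u04FF]/;

// Return 'rtl' or 'ltr' from the first strongly directional character,
// or null when the text has none (digits or punctuation only).
function firstStrongDir(text) {
  for (const ch of text) {
    if (RTL_CHAR.test(ch)) return 'rtl';
    if (LTR_CHAR.test(ch)) return 'ltr';
  }
  return null;
}

// Minimal HTML escaping for values inserted into markup.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Replace $1, $2, ... with the parameters, each wrapped in a <span> that
// carries an explicit dir attribute when a direction could be guessed.
function formatWithBidi(message, params) {
  return message.replace(/\$([0-9]+)/g, (match, paramNo) => {
    const value = params[Number(paramNo) - 1];
    if (value === undefined) return match; // unknown placeholder: keep as-is
    const dir = firstStrongDir(String(value));
    const attr = dir ? ' dir="' + dir + '"' : '';
    return '<span' + attr + '>' + escapeHtml(value) + '</span>';
  });
}

console.log(formatWithBidi('User $1 edited $2', ['Alice', '\u05E9\u05DC\u05D5\u05DD']));
```

Because the direction is decided per parameter at formatting time, a user name or page name of unknown directionality gets its own embedding, and the surrounding sentence keeps the direction of its block element.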
