https://bugzilla.wikimedia.org/show_bug.cgi?id=2867


Amir E. Aharoni <[EMAIL PROTECTED]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[EMAIL PROTECTED]




--- Comment #24 from Amir E. Aharoni <[EMAIL PROTECTED]>  2008-12-09 09:31:26 UTC ---
The most logical default sorting is not phonetic, but Unicode.

Let me explain.

It doesn't actually make too much sense that Finnish (Suomi) would come after
Russian (Русский). It does make a little sense, because Cyrillic is
somewhat related to Latin - both have a letter for "R", although it looks
different. But what if there was a language which is written in the Cyrillic
alphabet, and its name begins with a "Ж"? It is transliterated into Latin as
"ZH", but a speaker of that language would find it odd if it appeared at the
end of the list, because in Cyrillic this letter is close to the beginning. So
sorting Русский near the Latin R's happens to make some sense, but it is
a lucky coincidence.

This problem occurs with Yiddish (יידיש): It is sorted near the end. Why?
Because Y is near the end of the Latin alphabet? But the Hebrew letter י is
near the beginning of the Hebrew alphabet in which Yiddish is written.

It makes even less sense that Hebrew (עברית) would come after Italian. The
suggestion is to put Hebrew after Italian because a simple, non-scientific
transliteration of עברית is "Ivrit". The reality, however, is
that Hebrew speakers don't think that the first letter of their language's name
is an "I", but an "ע" (Ayin), which has no analog in the Latin alphabet;
hence, there is no clever way to put עברית in a "phonetic order".

These are just a few of the problems with languages with which i am familiar. I
don't know, for example, how convenient it is for a Japanese speaker to find
his language at N (for Nihongo, i presume).

The only solution to this is to make the default order of language names follow
the order of the scripts in Unicode. This means that language names will be grouped
by script: Latin (French, Ban-lam-gu, Estonian), Cyrillic (Russian, Mongol,
Sakha), Arabic (Arabic, Farsi, Urdu), Hebrew (Hebrew, Yiddish), Chinese
(Mandarin, Cantonese, Yue), Devanagari (Hindi, Nepali) etc. These groups will
appear in the order in which they appear in the Unicode standard. It is
technocratic, but it is the most neutral way i can think of. Certainly better
than putting עברית under I, which is not useful for Hebrew speakers.
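
To illustrate the grouping described above, here is a minimal Python sketch. It
is not MediaWiki code: the sample list and the names `NAMES`, `script_of` and
`unicode_order` are made up for this example. It relies on Python comparing
strings code point by code point, and it guesses the script from the first word
of the Unicode character name of the first letter, which is only a rough
heuristic.

```python
import unicodedata

# Hypothetical sample of language names as they might appear in the list.
NAMES = ["Suomi", "Русский", "עברית", "Français", "יידיש",
         "日本語", "Italiano", "العربية", "हिन्दी"]

def script_of(name):
    """Rough script guess: first word of the Unicode name of the first character."""
    return unicodedata.name(name[0]).split()[0]

def unicode_order(names):
    """Sort by code point; Python compares str values code point by code point."""
    return sorted(names)

# Print one name per line, prefixed with the guessed script.
for name in unicode_order(NAMES):
    print(script_of(name), name)
```

With this sample the Latin names come first, then Cyrillic, Hebrew, Arabic,
Devanagari and CJK, and within Hebrew יידיש comes before עברית because י
precedes ע. Plain code point order is crude inside a single script (uppercase
sorts before lowercase, accented letters after unaccented ones), so a real
implementation would probably want a proper collation within each group.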

And for the record - i support the option to have a language project define
languages that will appear at the top. De facto, for Norwegian it's Swedish,
Danish et al., for Hungarian and Hebrew it is English etc., and nothing is
wrong about it. It makes Wikipedia convenient.
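
A per-project list of languages at the top could sit in front of the default
order. The sketch below is only an assumption about how that might look; the
function `order_with_pinned` and its `pinned` parameter are hypothetical, not an
existing MediaWiki setting.

```python
def order_with_pinned(names, pinned):
    """Put names listed in `pinned` first (in the given order), then the rest by code point."""
    pinned_present = [n for n in pinned if n in names]
    rest = sorted(n for n in names if n not in pinned_present)
    return pinned_present + rest

# Example: a project that wants English at the top of its list.
print(order_with_pinned(["Suomi", "English", "Русский", "Magyar", "Dansk"], ["English"]))
# prints ['English', 'Dansk', 'Magyar', 'Suomi', 'Русский']
```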


