[Wikidata-bugs] [Maniphest] T256649: incorrect English names for languages (they display the native names only)

Trappist_the_monk Sat, 07 Aug 2021 12:57:08 -0700

Trappist_the_monk added a comment.

  In the week or so since my post (T256649#7246452 
<https://phabricator.wikimedia.org/T256649#7246452>), the four codes mentioned 
have been fixed at en.wiki.

  At en.wiki, a subset of those mentioned in the OP:

  - [gsw] = "Alemannisch" – should be "Alemannic" in English
  - [sty] = "себертатар" – must be "Northern Tatar" in English
  - [vo] = "Volapük" – should probably be "Volapuk" in English (without the 
combining diaeresis)
  - [vro] = [fiu-vro]= "Võro" – should probably be "Voro" in English (without 
the combining tilde)

  are different from OP's suggestions:

  - [gsw] = Swiss German
  - [sty] = Siberian Tatar – note that OP says '**must be** "Northern Tatar"' 
(emphasis added)
  - [vo] = Volapük – not changed, counter to OP's suggestion
  - [fiu-vro] = võro – why lowercase when:
    - [vro] = Võro (also not changed, counter to OP's suggestion)? but see 
T256649#7160228 <https://phabricator.wikimedia.org/T256649#7160228>

  Except for the Võro / võro capitalization discrepancy, I have no objections 
to the name choices that differ from OP's suggestions.

  In T256649#6928587 <https://phabricator.wikimedia.org/T256649#6928587>, 
@Trappist_the_monk wrote:

  > Another oddity that we have come across is `{{#language:he|am}}` which is 
terminated with U+FEFF zero width no-break space.
  >
  > Also, is there a reason that some language names use U+0027 apostrophe 
(O'odham is one) and other language names use U+2019 right single quotation 
mark (Cànan Hawai’i is one – but shouldn't the ʻokina be U+02BB modifier letter 
turned comma?)

  The above have not been answered.  In that post I neglected to mention that 
Cànan Hawai’i ← `{{#language:haw|gd}}` and O'odham ← `{{#language:ood|en}}`.  
There are about 570 language-code / target-language-code pairs that render the 
language name with U+2019 right single quotation mark and about 1770 pairs that 
render the language name with U+0027 apostrophe (no doubt many of these in both 
groups are fall-backs to some other target language).  At en.wiki, only `nqo` 
renders a language name with U+2019 right single quotation mark: N’Ko ← 
`{{#language:nqo|en}}`.  For code `nqo`, the ISO 639 custodians use U+0027 
apostrophe for 'N'ko' (ISO 639-3 and ISO 639-2 English) and for 'n'ko' (ISO 
639-2 French).  IANA, in the language-subtag-registry file, supports both 
'N'ko' (U+0027 apostrophe) and 'N’Ko' (U+2019 right single quotation mark).
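
  For anyone who wants to check a rendered name by hand, here is a minimal Python sketch that reports which of the apostrophe-like code points discussed above occur in a string, and whether the string ends with U+FEFF. It assumes the rendered name has already been copied out of the `{{#language:...}}` output; the function names are made up for illustration and are not part of MediaWiki.

```python
import unicodedata

# Code points discussed in this comment.
APOSTROPHE_LIKE = {"\u0027", "\u2019", "\u02bb"}
ZERO_WIDTH_NO_BREAK_SPACE = "\ufeff"


def apostrophe_report(name):
    """List (code point, Unicode name) for each apostrophe-like character in name."""
    return [
        (f"U+{ord(ch):04X}", unicodedata.name(ch))
        for ch in name
        if ch in APOSTROPHE_LIKE
    ]


def has_trailing_feff(name):
    """True when name ends with U+FEFF zero width no-break space."""
    return name.endswith(ZERO_WIDTH_NO_BREAK_SPACE)


if __name__ == "__main__":
    # Sample strings copied from this comment (escapes used for clarity).
    for sample in ("O'odham", "N\u2019Ko", "C\u00e0nan Hawai\u2019i"):
        print(sample, apostrophe_report(sample), has_trailing_feff(sample))
```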

  Still, shouldn't the U+2019 right single quotation mark be replaced with the 
U+0027 apostrophe in cases where U+2019 does not have special meaning (glottal 
stops or whatever)?  And where U+2019 right single quotation mark is used in 
place of special characters like ʻokina (Hawaiʻian and other languages), 
shouldn't U+2019 right single quotation mark be replaced with U+02BB modifier 
letter turned comma or other appropriate character?
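
  To make the question concrete, the replacement rule I have in mind could be sketched roughly like this in Python. This is only an illustration under stated assumptions: the set of codes that should get the ʻokina (`OKINA_CODES`) is hypothetical and would need to be decided per language by people who know the orthographies, and the function name is invented here.

```python
RIGHT_SINGLE_QUOTE = "\u2019"  # U+2019 right single quotation mark
APOSTROPHE = "\u0027"          # U+0027 apostrophe
OKINA = "\u02bb"               # U+02BB modifier letter turned comma

# Hypothetical: codes of languages whose names use the mark as a letter.
OKINA_CODES = {"haw"}


def normalize_name(code, name):
    """Replace U+2019 with U+02BB for listed codes, otherwise with U+0027."""
    replacement = OKINA if code in OKINA_CODES else APOSTROPHE
    return name.replace(RIGHT_SINGLE_QUOTE, replacement)


if __name__ == "__main__":
    print(normalize_name("nqo", "N\u2019Ko"))
    print(normalize_name("haw", "C\u00e0nan Hawai\u2019i"))
```

  Names that already use U+0027 or U+02BB pass through unchanged, so the rule would only touch the pairs that currently render U+2019.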

TASK DETAIL
  https://phabricator.wikimedia.org/T256649

To: Trappist_the_monk
Cc: Michael, Lucas_Werkmeister_WMDE, Nikerabbit, Esc3300, Huji, 
Trappist_the_monk, Aklapper, Verdy_p, Invadibot, maantietaja, Akuckartz, 
Nandana, Lahi, Gq86, Af420, GoranSMilovanovic, Mahir256, QZanden, LawExplorer, 
_jensen, rosalieper, Scott_WUaS, MuhammadShuaib, LNDDYL, Nikki, Psychoslave, 
Wikidata-bugs, aude, Nemo_bis, Raymond, Arrbee, KartikMistry, Mbch331, Jay8g
