https://bugzilla.wikimedia.org/show_bug.cgi?id=68490
Bug ID: 68490
Summary: Consider using content language for "html lang",
rather than interface language
Product: MediaWiki
Version: unspecified
Hardware: All
OS: All
Status: NEW
Severity: enhancement
Priority: Unprioritized
Component: Internationalization
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected],
[email protected], [email protected]
Web browser: ---
Mobile Platform: ---
0) Open Chromium and, in one tab, chrome://translate-internals/#detection-logs
1) Visit https://fi.wikipedia.org/wiki/Opiskelijaraha?uselang=it and
https://fi.wikipedia.org/wiki/Wikipedia:Etusivu?uselang=it (or set the interface
language to "it" and drop the uselang).
2) Save their HTML, change '<html lang="it"' to '<html lang="fi"' in the saved
copies, and open the copies in the same browser.
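The edit to the saved copies can also be scripted. The following is only a minimal sketch: the function name set_root_lang and the sample string are made up for illustration, and it assumes the root tag carries a double-quoted lang attribute, as in the MediaWiki output quoted in this report.

```python
import re

def set_root_lang(html_text: str, new_lang: str) -> str:
    """Replace the lang value on the opening <html> tag only (hypothetical helper)."""
    # Match '<html ... lang="' lazily so only the root tag's attribute is touched.
    pattern = re.compile(r'(<html\b[^>]*?\blang=")[^"]*(")', re.IGNORECASE)
    return pattern.sub(lambda m: m.group(1) + new_lang + m.group(2),
                       html_text, count=1)

# Shortened stand-in for a saved page; real saved HTML is much longer.
sample = ('<html lang="it" dir="ltr" class="client-nojs">'
          '<body><div lang="it"></div></body></html>')
patched = set_root_lang(sample, "fi")
print(patched)
```

Only the root attribute is changed; lang attributes on inner elements are left alone, which matches the manual edit described in the steps.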
I. Expected: the article content is recognised as "fi" locale and appropriately
translated (or not) by Chromium according to my preferences.
II. Observed: the result varies with the proportion of content text to
interface text, but in general the interface language prevails because it is
the one declared in the root html lang attribute. The detection log shows
something like this:
[
  {
    "adopted_language": "it",
    "cld_language": "it",
    "content_language": "fi",
    "html_root_language": "it",
    "is_cld_reliable": true,
    "time": 1406157710650.132,
    "url": "https://fi.wikipedia.org/wiki/Opiskelijaraha?uselang=it"
  },
  {
    "adopted_language": "und",
    "cld_language": "it",
    "content_language": "",
    "html_root_language": "fi",
    "is_cld_reliable": true,
    "time": 1406158649658.846,
    "url": "http://koti.kapsi.fi/~federico/tmp/Opiskelijaraha-it.html"
  },
  {
    "adopted_language": "und",
    "cld_language": "fi",
    "content_language": "fi",
    "html_root_language": "it",
    "is_cld_reliable": true,
    "time": 1406159280979.992,
    "url": "https://fi.wikipedia.org/wiki/Wikipedia:Etusivu?uselang=it"
  },
  {
    "adopted_language": "und",
    "cld_language": "en",
    "content_language": "",
    "html_root_language": "fi",
    "is_cld_reliable": true,
    "time": 1406159369076.235,
    "url": "http://koti.kapsi.fi/~federico/tmp/Etusivu-it.html"
  }
]
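To spot the conflicting entries in a longer log, something along these lines may help. It is a rough sketch, not a description of how Chromium decides on "adopted_language": the helper root_lang_conflicts is hypothetical, the embedded sample is shortened to the two fi.wikipedia.org entries above, and "content_language" is assumed to be the value the server declares for the page.

```python
import json

# Two entries copied from the log above, trimmed to the fields used here.
LOG = """
[
  {"adopted_language": "it", "cld_language": "it", "content_language": "fi",
   "html_root_language": "it",
   "url": "https://fi.wikipedia.org/wiki/Opiskelijaraha?uselang=it"},
  {"adopted_language": "und", "cld_language": "fi", "content_language": "fi",
   "html_root_language": "it",
   "url": "https://fi.wikipedia.org/wiki/Wikipedia:Etusivu?uselang=it"}
]
"""

def root_lang_conflicts(entries):
    """Return (url, root, declared, detected) for entries whose html lang
    disagrees with the declared content language or the CLD guess."""
    conflicts = []
    for e in entries:
        root = e["html_root_language"]
        # Ignore empty values, which appear for the locally saved copies.
        others = {e["content_language"], e["cld_language"]} - {""}
        if any(lang != root for lang in others):
            conflicts.append((e["url"], root,
                              e["content_language"], e["cld_language"]))
    return conflicts

conflicts = root_lang_conflicts(json.loads(LOG))
for url, root, declared, detected in conflicts:
    print(f"{url}: html lang={root}, declared={declared}, cld={detected}")
```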
See also https://code.google.com/p/chromium/issues/detail?id=254330#c6 and
https://code.google.com/p/chromium/issues/detail?id=95394#c20. Per
https://code.google.com/p/chromium/issues/detail?id=95394#c6 it doesn't look
likely that Chromium will be able to recognise the language of page fragments
any time soon.
Yet, we have some optimistic "lang" tagging like:
<html lang="it" dir="ltr" class="client-nojs">
<body class="mediawiki ltr sitedir-ltr ns-0 ns-subject page-Opiskelijaraha skin-monobook action-view">
  <div id="globalWrapper">
    <div id="column-content">
      <div id="content" class="mw-body-primary" role="main">
        <div id="bodyContent" class="mw-body">
          <div id="contentSub" lang="it" dir="ltr"></div>
          <!-- start content -->
          <div id="mw-content-text" lang="fi" dir="ltr"
               class="mw-content-ltr">
            PAGE IS HERE
          </div>
        </div>
      </div>
    </div>
    <div id="column-one" lang="it" dir="ltr">
      BASICALLY ALL INTERFACE
    </div>
  </div>
</body></html>
The interface is correctly tagged as "it" and the content as "fi", but they're
both wrapped in a (false) html lang which trumps them.
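
The mismatch can be checked mechanically on any rendered page. The sketch below uses Python's standard html.parser; the class name LangCollector and the trimmed snippet are illustrative only, and real pages would need to be fetched or saved first.

```python
from html.parser import HTMLParser

class LangCollector(HTMLParser):
    """Record the lang attribute of <html> and of elements that have an id."""

    def __init__(self):
        super().__init__()
        self.langs = {}

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "lang" not in attrs:
            return
        # Key the root element as "html", everything else by its id.
        key = "html" if tag == "html" else attrs.get("id")
        if key:
            self.langs[key] = attrs["lang"]

# Trimmed version of the markup quoted above.
SNIPPET = '''<html lang="it" dir="ltr" class="client-nojs">
<body>
<div id="contentSub" lang="it" dir="ltr"></div>
<div id="mw-content-text" lang="fi" dir="ltr" class="mw-content-ltr">PAGE IS HERE</div>
<div id="column-one" lang="it" dir="ltr">BASICALLY ALL INTERFACE</div>
</body></html>'''

collector = LangCollector()
collector.feed(SNIPPET)
langs = collector.langs
# True when the root declaration disagrees with the content wrapper.
mismatch = langs["html"] != langs["mw-content-text"]
print(langs, mismatch)
```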