Re: [Wikidata] Label gaps on Wikidata

Daniel Kinzler Mon, 27 Feb 2017 07:55:50 -0800

On 19.02.2017 at 17:00, Romaine Wiki wrote:
> Hi all,
> 
> If you look in the recent changes, most items have labels in English and those
> are shown in the recent changes and elsewhere (so we know what the item is
> about without opening first).


Wikidata actually tries to show you the labels in your preferred interface
language. And if your user language is not available, it uses a fallback
mechanism to show the next-best language, which may even include automated
transcriptions. When all else fails, it will show the English label. If that
doesn't exist, it shows the ID.
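
As a rough sketch of that lookup order (the function name and the fallback
table are made up for illustration; this is not the actual Wikibase code, and
it leaves out the automated transcription step):

```python
def pick_label(labels, user_lang, fallbacks, item_id):
    """Return the text to display for an item.

    labels:    dict mapping language code -> label text
    fallbacks: dict mapping language code -> ordered list of fallback codes
               (hypothetical data, not Wikidata's real fallback chains)
    """
    # Try the user's language, then its fallbacks, then English as a last resort.
    chain = [user_lang, *fallbacks.get(user_lang, []), "en"]
    for lang in chain:
        if lang in labels:
            return labels[lang]
    # No usable label at all: show the item ID.
    return item_id
```

With a table like {"de-at": ["de"]}, a user with de-at set sees the German
label if there is one, then the English label, then the bare ID.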

> But not all items have labels, and these items without
> English label are often items with only a label in Chinese, Arabic, Cyrillic
> script, Hebrew, etc. This forms a significant gap.

The fallback mechanism works OK, but is not great for English-speaking users,
who see a lot of items that have no English label. For English, we just don't
know what to fall back to. Just anything? Or try European languages first? What
should the rule be? If we can decide on a good rule, it should actually be
pretty simple to add such a fallback for English.

> Is there a way to easily make a transcription from one language to another?

We have such rules for some languages/variants, e.g. between the Cyrillic and
the Roman representations of Kazakh or Uzbek. But transliteration rules can be
complex, and covering every pair of the 300 languages we support would
mean we'd need about 45000 rule sets...
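
The rough figure comes from counting language pairs. A quick check, assuming
one rule set per unordered pair (rule sets that only work in one direction
would double the count):

```python
import math

LANGUAGES = 300  # approximate number of supported languages, as stated above

# One rule set per unordered pair of languages.
unordered_pairs = math.comb(LANGUAGES, 2)

# One rule set per direction (A -> B and B -> A counted separately).
ordered_pairs = LANGUAGES * (LANGUAGES - 1)

print(unordered_pairs)  # 44850
print(ordered_pairs)    # 89700
```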

> Or alternatively if there is a database that has such transcriptions?

Not yet. One of the goals of Wikidata is to be that database.

-- 
Daniel Kinzler
Principal Platform Engineer

Wikimedia Deutschland
Gesellschaft zur Förderung Freien Wissens e.V.

_______________________________________________
Wikidata mailing list
Wikidata@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikidata
