GoranSMilovanovic added a comment.

  @Manuel
  
  Here is a concise report that relies on UNESCO Language Status 
<http://www.unesco.org/languages-atlas/>:
  
  F34547890: Wikidata_LanguageStatusReport.nb.html 
<https://phabricator.wikimedia.org/F34547890>
  
  The analyses presented here can also be fully replicated using the Ethnologue 
language status <https://www.ethnologue.com/about/language-status> categories. 
Please let me know if you find that necessary or interesting. I opted for the 
UNESCO language status simply because I thought it would be better to use one 
criterion, even if we choose it ad hoc, than to deal with the more complicated 
situation of using two criteria (UNESCO and Ethnologue).
  
  From my perspective, the most important insights are:
  
  - Languages that are not endangered are much better represented than the 
endangered or vulnerable languages in terms of how many sitelinks they have; 
this is probably more relevant for the Wikipedia community than for us, but I 
thought we should help by informing them since we already have the numbers at 
hand;
  - Languages that are not endangered have many more labels in Wikidata than 
languages that are endangered or vulnerable;
  - Beyond that, the items labeled in languages that are not endangered are in 
general reused more across the Wikimedia projects than the items for which we 
have labels in endangered or vulnerable languages.
  
  I have used visualizations, labeling languages by their respective code, in 
order to single out the extremes on the following indicators:
  
  - number of sitelinks
  - number of items for which a particular language has labels
  - the reuse of items labeled by a particular language.
  
  In conjunction with the tables (all of them are provided in the report), 
that might help us figure out whether there are specific linguistic communities 
that we could approach to see if they need any help.
  
  The analysis is exploratory: I did not want to invest any time in statistical 
hypothesis testing (e.g. comparing groups or languages and deciding whether the 
differences are statistically significant) before we have at least a glimpse of 
the big picture.
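  
  For readers who want to try this kind of descriptive comparison themselves, 
here is a minimal sketch in Python. It is not the code behind the report: the 
record layout, the function names, and all language codes and numbers below are 
hypothetical placeholders. It only illustrates summarizing an indicator per 
language status group and singling out the extremes, without any significance 
testing.

```python
from collections import defaultdict
from statistics import median

# Hypothetical toy records: (language_code, unesco_status, n_sitelinks, n_labels).
# The codes and numbers are placeholders, NOT values from the report.
records = [
    ("l1", "not endangered", 1000, 5000),
    ("l2", "not endangered", 800, 3000),
    ("l3", "vulnerable", 40, 200),
    ("l4", "endangered", 10, 50),
    ("l5", "endangered", 30, 70),
]


def median_by_status(rows, column):
    """Return {status: median of the given column} for each status group."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[1]].append(row[column])
    return {status: median(values) for status, values in groups.items()}


def extremes(rows, column):
    """Return the language codes with the lowest and highest value in a column."""
    ordered = sorted(rows, key=lambda row: row[column])
    return ordered[0][0], ordered[-1][0]


sitelink_medians = median_by_status(records, 2)  # column 2: n_sitelinks
label_medians = median_by_status(records, 3)     # column 3: n_labels
lowest, highest = extremes(records, 2)

print(sitelink_medians)
print(label_medians)
print("fewest sitelinks:", lowest, "| most sitelinks:", highest)
```

  The median is used here only because it is robust to a few very large 
languages; any other descriptive summary would serve the same purpose.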
  
  Please let me know if anything needs further clarification; I am available 
for a 1:1 on this until late in the day (CET) on Wednesday, 14 July.

TASK DETAIL
  https://phabricator.wikimedia.org/T286257

To: GoranSMilovanovic
Cc: Tobi_WMDE_SW, Manuel, GoranSMilovanovic, Aklapper, Invadibot, maantietaja, 
Akuckartz, Nandana, Lahi, Gq86, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331