The recent elections showed us that language issues and translation
are something we have to take very seriously from now on.  As a first
step towards improving communication, it seems we should get an idea
of which users speak which languages.

We could ask them directly, but on reflection, the information is
already implicit in our database.  A multilingual user is one who
actively edits two projects in different languages.

In devising a comprehensive translation strategy, we need to know how
interconnected any two given projects are.  We also need to know how
connected any given project is to English, since it's our working
language.

We need to pay special attention to languages that are very 'distant'
from English, distant in the sense of having few members who are
fluent in both English and the language in question.

Could someone help me get this data, or explain why I don't need it,
or why we already have it?

Specifically, I'm looking for:
#   For each non-English-language project, how many of its active
users are ALSO active on an English-language project? (the answer
should be a single whole number for each project)
#   For any two projects, how many users are there who are active on
both? (answer is a square matrix, roughly 750x750)
#   For any two languages, how many users appear to speak both
languages? (answer is a square matrix, roughly 750x750)
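
To make the three questions concrete, here is a rough Python sketch of
the counting I have in mind. It is only an illustration under several
assumptions: that someone can first produce a list of (user, project)
pairs for users who count as "active" (how to define that is open),
that the same user name on two projects is the same person, and that a
project's language is the prefix of a name like "fr.wikipedia". The
names overlap_counts and project_language are made up for this
example.

```python
from collections import Counter, defaultdict
from itertools import combinations


def project_language(project):
    # Assumed naming convention (hypothetical): "fr.wikipedia" -> "fr".
    return project.split(".", 1)[0]


def overlap_counts(active_pairs, working_language="en"):
    """Count shared active users per project, project pair and language pair.

    active_pairs is an iterable of (user, project) tuples; deciding who
    counts as "active" is left to whoever builds that list.
    """
    projects_by_user = defaultdict(set)
    for user, project in active_pairs:
        projects_by_user[user].add(project)

    also_english = Counter()    # question 1: one whole number per project
    project_pairs = Counter()   # question 2: upper triangle of the matrix
    language_pairs = Counter()  # question 3: upper triangle of the matrix

    for projects in projects_by_user.values():
        languages = {project_language(p) for p in projects}
        if working_language in languages:
            for p in projects:
                if project_language(p) != working_language:
                    also_english[p] += 1
        # Sorting makes each unordered pair appear under a single key.
        for a, b in combinations(sorted(projects), 2):
            project_pairs[(a, b)] += 1
        for a, b in combinations(sorted(languages), 2):
            language_pairs[(a, b)] += 1

    return also_english, project_pairs, language_pairs
```

Only pairs with at least one shared user are stored, so the two
matrices stay sparse; a missing pair simply reads as zero.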

Does anyone know how to pull this out of the database?  It's an
important question if we want to recruit translators and, really,
just to assess "where we are" in terms of inter-project language
capabilities.

Alec
