Hi Kai,

You should start with Ethnologue's country data; this website provides the
most comprehensive data. Be aware, though, that the data may not be up to
date, so compare it with the Endangered Languages Project
<https://endangeredlanguages.com/> and UNESCO's World Atlas of Languages
<https://en.wal.unesco.org/>.

In the case of my country, Indonesia, the power dynamics between the
national language, Indonesian, and Indigenous (local) languages lead to
language shift toward Indonesian or toward the major lingua franca of each
region, such as Makassar Malay in greater South Sulawesi. It is also hard
to calculate the current numbers exactly, because the latest official
population census paid little attention to language diversity.

Hope this helps.

Best,
Biyanto


On Fri, Jun 7, 2024 at 7:30 AM Kai Zhu <[email protected]> wrote:

> Dear all,
>
> I am currently undertaking a research project that explores the choice of
> language when reading Wikipedia across different countries. One of the
> tasks of my study involves mapping Wikipedia languages to the countries
> where these languages are predominantly spoken. I recognize the complexity
> of this task and understand that a perfect mapping might not be possible.
> However, I would appreciate any recommendations on the best methodologies,
> practices, or data sources for accomplishing this.
>
> Additionally, I have a related question: What are good data sources for
> information regarding the proportion of a country's population that speaks
> various languages?
>
> Thank you for your help and insights.
>
> Best regards,
> Kai Zhu
> Assistant Professor
> Bocconi University
> _______________________________________________
> Wiki-research-l mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
>
