Illario, Latin doesn't have L1 speakers. And data about languages are such
a mess that I would stick with Ethnologue's data for L1 speakers, although
they are not reliable. Ethnologue counts like this: "there are 100,000
speakers of language X in country A and 34 in country B, thus there are
100,034 speakers in total" (although the likely error margin of the first
number is 150 times larger than the second number). It also has numerous
other flaws, like the fringe "macrolanguage" category. However, besides
counting the same way, English Wikipedia has much worse failures wherever
it is not based on Ethnologue's data, once we leave the safety of the ~50
major languages. (It's mostly about the wishful thinking of ethnic
nationalists and a chronic lack of manpower to fix that bullshit promptly.)

Nemo, yes, I was thinking about various data beyond article count and
GDP/PPP per capita, so here are my thoughts, including those two:

* Article count per speaker gives one nice pseudo-hyperbolic curve.
Basically, you can see a hyperbolic curve by drawing the line over the
highest points: Hawaiian-Upper Sorbian-Basque-Swedish-Dutch-English. By
normalizing the numbers, we could get targets per language.

* However, edit count seems like a better idea. I think, though it has to
be proved, that such numbers won't have to be adjusted for the number of
speakers themselves.

* We could count various numbers related to users. For example, it seems
that the smaller the ratio between the number of active and very active
users, the healthier the community. Also, the number of editors per million
speakers per GDP or HDI could be a useful parameter.

* I was thinking yesterday about HDI. But then I realized that it would be
good to create all the possibly relevant charts and see what information
they bring. I am interested in comparing Wikipedia stats with the Gini
coefficient, for example.
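
The ratios in the list above could be computed along these lines. This is
only a rough sketch, not a finished method: every number in SAMPLE is
invented, the edition codes are placeholders rather than real Wikipedia
editions, and the names Edition, upper_envelope and target_articles are
hypothetical. The "line over the highest points" is approximated here by a
simple staircase (keep each edition whose articles-per-speaker ratio is
higher than that of every more widely spoken language), which is an
assumption standing in for a properly fitted hyperbolic curve.

```python
from dataclasses import dataclass


@dataclass
class Edition:
    code: str         # placeholder label, not a real Wikipedia edition
    speakers: int     # L1 speakers (invented number)
    articles: int     # article count (invented number)
    active: int       # active users (invented number)
    very_active: int  # very active users (invented number)


def articles_per_speaker(e: Edition) -> float:
    return e.articles / e.speakers


def editors_per_million(e: Edition) -> float:
    return e.active * 1_000_000 / e.speakers


def activity_ratio(e: Edition) -> float:
    # Active users per very active user; infinite if nobody is very active.
    return e.active / e.very_active if e.very_active else float("inf")


def upper_envelope(editions: list[Edition]) -> list[Edition]:
    # Walk from the most-spoken language down and keep every edition whose
    # articles-per-speaker ratio beats every edition processed before it.
    ordered = sorted(
        editions, key=lambda e: (-e.speakers, -articles_per_speaker(e))
    )
    best = float("-inf")
    frontier = []
    for e in ordered:
        ratio = articles_per_speaker(e)
        if ratio > best:
            frontier.append(e)
            best = ratio
    return frontier[::-1]  # ascending by number of speakers


def target_articles(e: Edition, frontier: list[Edition]) -> int:
    # Take the nearest frontier edition with at least as many speakers and
    # scale its article count down to this edition's speaker count.
    ref = next(f for f in frontier if f.speakers >= e.speakers)
    return round(ref.articles * e.speakers / ref.speakers)


# Invented sample data, for illustration only.
SAMPLE = [
    Edition("a", 1_000, 2_000, 20, 4),
    Edition("b", 10_000, 5_000, 30, 3),
    Edition("c", 100_000, 80_000, 200, 40),
    Edition("d", 1_000_000, 100_000, 500, 50),
    Edition("e", 10_000_000, 3_000_000, 9_000, 1_500),
]

if __name__ == "__main__":
    frontier = upper_envelope(SAMPLE)
    for e in SAMPLE:
        print(e.code, target_articles(e, frontier),
              editors_per_million(e), activity_ratio(e))
```

Editions on the staircase get their own article count as the target; the
others get a target scaled from the nearest better-performing larger
language. Dividing editors_per_million by GDP or HDI would be one more
column, once those figures are matched to languages.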

And I will do that, after I finish the most frustrating part of the job:
drawing the line between Wikipedia editions, Ethnologue data and actual
languages. The good news is that I am on the ~150th of ~280 Wikipedia
editions and it's likely I will finish during the next week. (After almost
eight years of dealing with this matter, whenever someone says that there
are two hundred eighty something Wikipedia languages or that there are 7000
languages in the world, I reach for my revolver.)
On Jun 12, 2015 20:51, "Federico Leva (Nemo)" <> wrote:

> Milos Rancic, 08/06/2015 00:23:
>> And I suppose somebody with statistical knowledge would be able to
>> give us the number which would have meaning "ability to create
>> Wikipedia article".
> Why not use the human development index (HDI) as factor? Also, instead of
> the number of articles I'd rather use database size or number of words.
> Nemo