2013/1/22 Ravishankar <ravidre...@gmail.com>

> Arjuna,
> Your approach to have an idea of a community's activity by combining some
> metrics is interesting. Also, to compare monthly averages instead of end of
> year performance is also good.
One clarification: I took the sum of the metrics for the entire year rather
than the average.

> *Increase in database size, number of most active contributors (making
> more than 100 edits a month) and page views can be real indicators of
> growth and activity*. Is there any way to find database size? We don't
> have that and many other useful stats after May 2010. Check
> http://stats.wikimedia.org/EN/TablesWikipediaTA.htm for example.

Yes and no, as any metric can be compromised. If database size is a
parameter, it can be pushed up with too many stub articles, which may
adversely affect quality.

The number of most active contributors is too small a number in most Indian
language Wikipedias. There has been debate about the percentage of people
who contribute to Wikipedia in English. While Jimmy is reported to have said
that it is a small fraction that contributes the bulk, Aaron Swartz did an
analysis showing that the bulk of the contribution is made by a large number
of Wikipedians, based on the bytes-added metric. In Wikipedia, I think the
former is more true, though it could be different if there are people who
are contributing stubs rather than reasonably sized articles.

You can get an account on Toolserver and obtain the database size.

> Also, any inference based on page views should normalize it based on the
> population of native speaking people to have a better perspective of the
> community's performance compared to its potential / size.

My analysis is more focussed on the relative change in each language than on
absolute numbers. The graphs included the information for all the
languages to get a general feel of the languages together.
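
To make the calculation concrete, here is a minimal sketch of what I mean
by relative change on yearly sums. The language names and monthly numbers
are made up for illustration only; they are not the real statistics, and
the function names are my own.

```python
def yearly_total(monthly_values):
    """Sum a metric over the whole year (a sum, not a monthly average)."""
    return sum(monthly_values)


def relative_change(previous_total, current_total):
    """Fractional change from one year's total to the next."""
    if previous_total == 0:
        raise ValueError("previous year's total is zero; change is undefined")
    return (current_total - previous_total) / previous_total


# Hypothetical monthly edit counts, NOT real data.
monthly_edits = {
    "lang_a": {2011: [100] * 12, 2012: [150] * 12},
    "lang_b": {2011: [1000] * 12, 2012: [1100] * 12},
}

changes = {
    lang: relative_change(yearly_total(years[2011]), yearly_total(years[2012]))
    for lang, years in monthly_edits.items()
}

for lang, change in sorted(changes.items()):
    print(f"{lang}: {change:+.0%}")
```

In this made-up case the smaller language grows faster in relative terms
even though its absolute numbers are much lower, which is the kind of
comparison I was after.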

It is not population alone that can be a significant factor, as there
are several other factors, like the love for the language and the geographic
distribution of its speakers. Population numbers could be considered, but
then there is a lot of variation depending on the sources.

An exhaustive study would be useful. If we can determine an independent
measure (maybe by an annual survey) of the impact of Wikipedia in
accomplishing its mission, then the various available metrics, like edits,
page views and their combinations, can be tried using statistical
algorithms to arrive at key parameters or parameter combinations.
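
As a rough sketch of one simple way such a study could start, the code
below ranks candidate metrics by how strongly they correlate with an
independent score. Pearson correlation is just one possible choice of
statistical method, and the survey scores and metric values are
hypothetical placeholders, since no such survey exists yet.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant sequence has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical yearly survey scores (the independent measure), NOT real data.
survey_score = [1, 2, 3, 4, 5]

# Hypothetical yearly values of candidate metrics, NOT real data.
metrics = {
    "edits": [10, 20, 30, 40, 50],
    "page_views": [5, 3, 4, 1, 2],
}

# Rank metrics by the strength (absolute value) of their correlation.
ranked = sorted(
    ((name, pearson(values, survey_score)) for name, values in metrics.items()),
    key=lambda item: abs(item[1]),
    reverse=True,
)

for name, r in ranked:
    print(f"{name}: r = {r:+.2f}")
```

A real study would need many more data points and would have to check
combinations of metrics as well, but the idea is the same.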

I think it would be good for WMF to work on such a thing, as otherwise our
focus could be lost by looking at simple metrics like active editor/page

