Awesome work! It's interesting to see Finnish as the outlier here. Do we have any fi-users on the list who can comment on this and might know what's going on? (And, in the absence of Finns: Jan, heard anything from across the border? :p)
The only caution I'd raise is that these numbers don't include spider filtering. Why is this important? Well, a lot of traffic is driven by crawlers and spiders and automata, particularly on smaller projects, and it can lead to weirdness as a result. With the granular pagecount files there's some work that can be done to detect this (for example, using burst detection and a few heuristics around concentration measures to eliminate pages that are clearly driven by automated traffic - see the recent analytics mailing list thread) but only some.

I appreciate this is a flaw in the data we are releasing, not in your work, which is an excellent read and highly interesting :).

I agree that understanding the lack of development in the PRC and ROK is crucial - we keep talking about the "next billion readers" but only talking :(

On 16 March 2015 at 02:21, h <hant...@gmail.com> wrote:
> Dear all,
>
> I have some findings to show the page views per Internet user
> measurement may help comparing different language editions of Wikipedia.
> Criticism and suggestions are welcome.
>
>
> -----
> http://people.oii.ox.ac.uk/hanteng/2015/03/15/comparing-language-development-in-wikipedia-in-terms-of-page-views-per-internet-users/
>
> Which language version of Wikipedia enjoys the most page views per language
> Internet user than expected? It is Finnish. In terms of absolute positive
> and negative gap, English has the widest positive gap whereas Chinese has
> the largest negative gap.
>
> ......
>
> In particular, it is known that Wikipedia (and Google which often favours
> Wikipedia) faces local competition in the People's Republic of China and
> South Korea. Therefore it is understandable the page views may be lower in
> Chinese and Korean Wikipedia language projects simply because some users'
> need to read user-generated encyclopedias are satisfied by other websites.
> However, it remains an important question to examine why these particular
> Latin and Asian languages are under-developed for Wikipedia projects.
>
> _______________________________________________
> Wiki-research-l mailing list
> Wiki-research-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>

--
Oliver Keyes
Research Analyst
Wikimedia Foundation