On 7/27/13 10:29 AM, Denny Vrandečić wrote:
> I still would worry, though: our content is increasing linearly, as you
> say, but the number of active contributors is not. If we take for granted
> that active contributors are the ones who provide quality control for the
> articles, this means that since 2006 or so the ratio of content per
> contributor is linearly declining, which would mean that our quality would
> suffer.


One useful bit of information is what *kind* of editors there are, not just the raw numbers.

For example, here is a hypothetical situation, of the kind I think James and John are contemplating, that would produce a numerical decline in editors-per-article with no real change in the editorial attention the article actually receives:

* Article in 2007, with 19 editors: Initial content written by 1 person, moderate expansions from 3 people, copyediting from 5 people, vandalism-rollback from 10 people

* Similar article in 2013, with 12 editors: Initial content written by 1 person, moderate expansions from 3 people, copyediting from 3 people and 1 typo-fixing bot, vandalism-rollback from 2 people and 2 anti-vandal bots

Basically all that happened in this hypothetical is that two of the typo-fixing copyeditors were replaced by a typo-fixing bot, and the rollbacks that would once have been done by 8 recent-changes patrollers were instead done by 2 anti-vandal bots. Maybe that's not what the change actually looks like, but I don't think the raw edit-count data can tell us either way.

I think this is also a potential issue with the definition of active users, which is 5 edits/month for "active" and 100 edits/month for "very active". The latter in particular heavily favors people who make many small edits over those who make fewer large ones. And are there editors contributing substantial amounts of content to Wikipedia who don't even hit the lower threshold? One possible group is people whose main contribution is writing new articles, and who do little to no other editing. Some people write offline and then contribute a new, well-referenced article in a single edit. If that's their only involvement in Wikipedia, they wouldn't be counted as active Wikipedians in the numbers, even if they're sending us a steady stream of 1-2 new articles/month.

I'm not sure how to best answer those questions automatically. Bytes, as James suggests, could be one possible proxy, but in addition to total bytes, we could look at the editor level. Has there been a decline in "active editors" if we define active editing as changing more than N bytes in the encyclopedia in a month, not counting rollbacks? That would count people who wrote substantial new articles as active, even if they did it in only 1 or 2 edits/month (although on the other hand, it wouldn't count people who made 100 rollbacks and no other edits).
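To make that concrete, here is a rough sketch of what a byte-based "active editor" count could look like. Everything in it is my own assumption for illustration: the (editor, size change, is-rollback) record layout, the function name, the sample data, and in particular the use of the absolute size change of an edit as a stand-in for "bytes changed" (an edit that rewrites a paragraph at the same length would score near zero, so a real analysis would probably want a diff-based measure instead).

```python
from collections import defaultdict


def active_editors_by_bytes(edits, min_bytes):
    """Return the editors whose non-rollback edits changed more than min_bytes.

    `edits` is an iterable of (editor, size_delta, is_rollback) tuples for a
    single month. This layout is hypothetical, not an existing stats format.
    """
    changed = defaultdict(int)
    for editor, size_delta, is_rollback in edits:
        if is_rollback:
            continue  # rollbacks are excluded from the byte count
        # Assumption: absolute size change approximates "bytes changed".
        changed[editor] += abs(size_delta)
    return {editor for editor, total in changed.items() if total > min_bytes}


# Hypothetical month of edits.
edits = (
    [("offline_writer", 12000, False)]      # one new article in a single edit
    + [("patroller", -250, True)] * 100     # 100 rollbacks, nothing else
    + [("typo_fixer", 2, False)] * 6        # six tiny fixes
)

active = active_editors_by_bytes(edits, min_bytes=1000)
```

With a threshold of 1000 bytes, only the offline writer counts as active here, even though the edit-count definition would rank that person last of the three.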

Another possibility could be to sample a subset of either articles or editors, and manually annotate what kind of editing is going on. That would be more tedious and would of necessity cover only a small subset of the encyclopedia, but it might avoid papering over things that are obvious when you look at them but tend to get lost in big-data analyses.
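Drawing the sample itself is the easy part. A minimal sketch, assuming we already have a list of article titles (the titles, the function name, and the seed below are made up for illustration); fixing the seed just means several annotators can regenerate the same sample:

```python
import random


def sample_for_annotation(titles, k, seed=0):
    """Draw a reproducible simple random sample of up to k titles."""
    rng = random.Random(seed)   # fixed seed so the sample can be regenerated
    pool = sorted(set(titles))  # stable order, duplicates removed
    return rng.sample(pool, min(k, len(pool)))


# Hypothetical list of titles standing in for a real article dump.
titles = [f"Article {i}" for i in range(1000)]
to_annotate = sample_for_annotation(titles, k=20, seed=2013)
```

The same function works for sampling editors instead of articles. A simple random sample will mostly hit short, rarely edited articles, so stratifying by article size or age might be worth considering.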

-Mark

_______________________________________________
Wikimedia-l mailing list
[email protected]
Unsubscribe: https://lists.wikimedia.org/mailman/listinfo/wikimedia-l, 
<mailto:[email protected]?subject=unsubscribe>
