On Thu, May 31, 2012 at 11:08 AM, Carl (CBM) <[email protected]> wrote:
> There is a redacted (no user info) table in the toolserver database
> that can be used to count the number of editors who watchlist a page.
> I fetched the counts for the 100 articles and found the median.

Ah. That's interesting to know and useful for context, thank you.

On Thu, May 31, 2012 at 11:59 AM, WereSpielChequers <[email protected]> wrote:
> Firstly rather than measure vandalism it created vandalism, and vandalism
> that didn't look like typical vandalism. Aside from the ethical issue
> involved, this will have skewed the result.

As I've said multiple times, this was a designed feature, and not a bug. The goal was not to measure the reversion rate of the broadest possible kind of vandalism, as that has been amply studied*, but that of a specific kind. Complaining that this specific kind is 'skewed' compared to 'all possible vandalism related to external links' misses the point.

* Wikipedia generally does very well on *obvious* vandalism, especially since the introduction of anti-vandalism bots with machine learning techniques. There's no need for anyone to spend time measuring it except perhaps bot-writers who want to fine-tune their statistics.

> In particular the edit
> summaries were very atypical for vandalism, if I'd seen that edit summary
> on my watchlist I would probably have just sighed and taken it as another
> example of deletionism in action.

I propose a version of http://en.wikipedia.org/wiki/Poe%27s_law - it is impossible to create an example of deletionism mindless enough to be detectable as such if it comes with jargon attached.

> Of the more than 13,000 pages on my
> watchlist I doubt there are 13 where I would look at such an edit, and
> that's if it was one of the changes on my watchlist that I was even aware
> of - it is far too big to fully check every day. Most IP vandals don't use
> jargon in edit summaries, and I know I'm not the only editor who is more
> suspicious of IP edits with blank edit summaries.
>
> You only ran the experiment for one month.
> I often revert older vandalism
> than that, I may be unusual there in that I've got some tools for finding
> vandalism that has got past the hugglers, but I'm not unusual in sometimes
> taking articles back to the "last clean version".

You are unusual. When I was spending time reading academic publications on Wikipedia a few years ago, a number of them dealt with quantifying vandalism and reversions; almost all vandalism was reverted within days, and reversions which took longer than a month were very rare (0-10%, IIRC, to be very generous). This was why I chose to wait a month: waiting longer would have added nothing. A week would have been adequate.

There are a number of related papers, but for brevity's sake take ftp://193.206.140.34/mirrors/epics-at-lnl/WikiDumps/localhost/group282-priedhorsky.pdf which found an exponential distribution for ordinary vandalism:

> 42% of damage incidents are repaired essentially immediately (i.e., within
> one estimated view). This result is roughly consistent with the work of
> Viégas et al. [20], which showed that the median persistence of certain types
> of damage was 2.8 minutes. However, 11% of incidents persist beyond 100
> views, 0.75% – 15,756 incidents – beyond 1000 views, and 0.06% – 1,260
> incidents – beyond 10,000 views.

On average, the articles concerned had fewer than 100 page views a day going by stats.grok.se, so within just a few days most of the edits should have been reverted - if they were going to be, of course.
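
As a rough sketch of that arithmetic: the snippet below converts the view thresholds quoted from the paper into days, assuming a constant 100 page views a day (the upper bound mentioned above). The names `SURVIVAL_BY_VIEWS`, `survival_table` and `ASSUMED_VIEWS_PER_DAY` are illustrative only and come from neither the paper nor stats.grok.se.

```python
# Fractions of damage incidents still unrepaired after N page views,
# as quoted above from the Priedhorsky et al. paper.
SURVIVAL_BY_VIEWS = {100: 0.11, 1_000: 0.0075, 10_000: 0.0006}

# Assumption: a flat 100 views/day, the upper bound given for these articles.
ASSUMED_VIEWS_PER_DAY = 100


def survival_table(views_per_day):
    """Return (days, fraction_unrepaired) pairs for each quoted view threshold."""
    return [
        (views / views_per_day, fraction)
        for views, fraction in sorted(SURVIVAL_BY_VIEWS.items())
    ]


for days, fraction in survival_table(ASSUMED_VIEWS_PER_DAY):
    print(f"after ~{days:g} days: {fraction:.2%} of incidents still unrepaired")
```

Since 100 views a day is an upper bound, the day counts are lower bounds: an article with half the traffic takes twice as long to reach each threshold.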

This sort of behavior is why you see such different averages and medians when you go looking in papers; e.g.:

- ["Measuring Wikipedia"](http://eprints.rclis.org/bitstream/10760/6207/1/MeasuringWikipedia2005.pdf), Voss 2005
- ["Studying Cooperation and Conflict between Authors with history flow Visualizations"](http://alumni.media.mit.edu/~fviegas/papers/history_flow.pdf), Viégas et al 2003
- ["Detecting Wikipedia vandalism via spatio-temporal analysis of revision metadata?"](http://repository.upenn.edu/cgi/viewcontent.cgi?article=1963&context=cis_reports), West 2010
- ["User Contribution and Trust in Wikipedia"](http://www.ics.uci.edu/~sjavanma/CollabCom), Javanmardi et al
- ["He says, she says: conflict and coordination in Wikipedia"](http://nguyendangbinh.org/Proceedings/CHI/2007/docs/p453.pdf), Kittur et al 2007

On Thu, May 31, 2012 at 12:03 PM, Thomas Morton <[email protected]> wrote:
> This, I think, is a major issue which make the results useless
>
> * The edit summary implies policy knowledge, I'd only check an edit like
> that on my watchlist on occasion.

And deletionists have no policy knowledge?

--
gwern
http://www.gwern.net
