There have been a few publications on the subject:
1. "Us vs. them: Understanding social dynamics in Wikipedia with revert
graph visualizations", B Suh, EH Chi, BA Pendleton.
2. "He says, she says: Conflict and coordination in Wikipedia.", A Kittur, B
Suh, BA Pendleton.


From my experience, analyzing MD5s is not enough to identify all reverts,
and there are some tricks even to the MD5 approach. Generally you need
knowledge of user reputations, article content, and comment content to
identify true reverts.


There are several groups of reverts which can be loosely identified as:
 * regular reverts;
 * self-reverts;
 * revert wars;

You need to take care of these cases when identifying reverts.
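
As a rough illustration of the MD5 baseline and of the first two groups,
here is a minimal sketch. It assumes revisions come as (rev_id, user, text)
tuples in chronological order; the function name find_reverts and the
result fields are made up for this example and are not taken from any
existing tool. It does not detect revert wars, and it only finds exact
(identity) reverts.

```python
import hashlib


def md5_of(text):
    """MD5 hex digest of the revision text."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def find_reverts(revisions):
    """Find identity reverts in a chronological list of revisions.

    revisions: list of (rev_id, user, text) tuples (assumed format).
    Returns one dict per detected revert.
    """
    last_seen = {}  # digest -> index of the latest revision with that text
    reverts = []
    for i, (rev_id, user, text) in enumerate(revisions):
        digest = md5_of(text)
        j = last_seen.get(digest)
        # A repeat of the directly preceding revision undoes nothing,
        # so require at least one revision in between.
        if j is not None and i - j > 1:
            undone = revisions[j + 1:i]
            undone_users = {u for _, u, _ in undone}
            # Self-revert: the reverting user only undid their own edits.
            kind = "self-revert" if undone_users == {user} else "regular"
            reverts.append({
                "rev": rev_id,
                "to": revisions[j][0],
                "by": user,
                "kind": kind,
                "reverted": [r for r, _, _ in undone],
            })
        last_seen[digest] = i
    return reverts


if __name__ == "__main__":
    history = [
        (1, "A", "foo"),
        (2, "B", "foo bar"),
        (3, "A", "foo"),
        (4, "C", "baz"),
        (5, "C", "foo"),
    ]
    for r in find_reverts(history):
        print(r)
```

Matching against the latest revision with the same digest, rather than the
earliest, is a design choice of this sketch; either can be defended.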


Some cases can be tricky, for example:
 # Marking : between duplicates, by other users (reverted, questionable)
 # Revision 54 (regular edit)       User0    Regular edit
 # Revision 55 (regular edit)       User1    Regular edit
 # Revision 56 (revert to 54)       User2    Vandalism
 # Revision 57 (vandalism)          User2    Vandalism
 # Revision 58 (revert to 56/54)    User3    Correcting vandalism, but not quite
 # Revision 59 (revert to 55)       User4    Revert to Revision 55

Note that User2 tried to hide the 'revert vandalism' (Revision 56) behind
regular vandalism (Revision 57). This misled User3, who reverted only to
the already vandalized Revision 56; User4 finally corrected it by
reverting to Revision 55.

Blanking also creates duplicate MD5 signatures (every blanked revision has
the same hash), so you need to handle these separately.
And of course users also revert manually, and in some cases the result is
not an exact copy of the earlier revision.
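
One possible way to deal with blanked pages, sketched under my own
assumptions: compute the hash through a helper (content_key is a made-up
name) that returns None for empty text, so blanked revisions never match
each other. The whitespace normalization is an extra assumption; it only
catches manual reverts that differ in whitespace, not other inexact ones.

```python
import hashlib


def content_key(text):
    """Return a comparison key for a revision, or None if it is blank.

    Whitespace runs are collapsed so that revisions differing only in
    whitespace get the same key (an assumption of this sketch).
    """
    normalized = " ".join(text.split())
    if not normalized:
        # Blanked page: never treat it as a duplicate of another blanking.
        return None
    return hashlib.md5(normalized.encode("utf-8")).hexdigest()
```

When matching duplicates, revisions whose key is None are simply skipped.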

If you are familiar with Python, you may want to take a look at the
following code: see line 444, def analyze_reverts(revisions), in
 http://code.google.com/p/pymwdat/source/browse/trunk/toolkit.py


-- Best, Dmitry



On Thu, Aug 18, 2011 at 2:40 AM, Flöck, Fabian <[email protected]> wrote:

> Hi,
>
> I'm trying to detect reverts in Wikipedia for my research, right now with a
> self-built script using MD5hashes and DIFFs between revisions. I always read
> about people taking reverts into account in their data, but it's seldomly
> described HOW exactly a revert is determined or what tool they use to do
> that. Can you point me to any research or tools or tell me maybe what you
> used in your own research to identify which edits were reverted and/or who
> reverted them?
>
> Best,
>
> Fabian
>
>
>
>
> --
> Karlsruhe Institute of Technology (KIT)
> Institute of Applied Informatics and Formal Description Methods
>
> Dipl.-Medwiss. Fabian Flöck
> Research Associate
>
> Building 11.40, Room 222
> KIT-Campus South
> D-76128 Karlsruhe
>
> Phone: +49 721 608 4 6584
> Skype: f.floeck_work
> E-Mail: [email protected]
> WWW: http://www.aifb.kit.edu/web/Fabian_Flöck
>
> KIT – University of the State of Baden-Wuerttemberg and
> National Research Center of the Helmholtz Association
>
>
> _______________________________________________
> Wiki-research-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
>