> I am currently reviewing work on spam detection on Wikipedia. West et al.
> (2011) <https://dl.acm.org/doi/pdf/10.1145/2038558.2038574> found that
> *the length (in characters) of the revision summary* was one of the
> features with the greatest weight in the final classifier.

Oh yeah, adding some quantitative evidence to what Jonathan pointed out about blank edit summaries being a useful signal: "Some 88% of spam leaves [the edit summary] blank...", which they contrast with only 17% of external link additions by trusted users lacking an edit summary.
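In case it helps the formative work, here is a minimal sketch of the two signals mentioned above: the summary length in characters and the share of edits with a blank summary. The function names and the toy data are my own assumptions for illustration, not taken from West et al., and treating whitespace-only summaries as blank is also an assumption.

```python
from typing import Iterable, Optional


def summary_length(summary: Optional[str]) -> int:
    """Length in characters of an edit summary; a missing summary counts as 0."""
    return len(summary or "")


def is_blank(summary: Optional[str]) -> bool:
    """Assumption: empty, whitespace-only, and missing summaries are all 'blank'."""
    return summary is None or summary.strip() == ""


def blank_rate(summaries: Iterable[Optional[str]]) -> float:
    """Fraction of edits in the sample whose summary is blank."""
    flags = [is_blank(s) for s in summaries]
    if not flags:
        raise ValueError("need at least one edit to compute a rate")
    return sum(flags) / len(flags)


# Made-up toy samples, not real Wikipedia data.
spam_like = ["", None, "   ", "added link"]
trusted_like = ["add source for population figure", "fix dead link", "", "cite journal"]

print(blank_rate(spam_like))     # 0.75
print(blank_rate(trusted_like))  # 0.25
print(summary_length("fix dead link"))  # 13
```

Running the same two functions over link-adding edits split by user group would give numbers directly comparable to the 88% vs. 17% figures, assuming the groups are defined the way the paper defines them.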
On Thu, Aug 5, 2021 at 5:18 AM Pablo Aragón <[email protected]> wrote:

> Hi Isaac,
>
> I am currently reviewing work on spam detection on Wikipedia. West et al.
> (2011) <https://dl.acm.org/doi/pdf/10.1145/2038558.2038574> found that
> *the length (in characters) of the revision summary* was one of the
> features with the greatest weight in the final classifier.
>
> Best,
>
> On Wed, Aug 4, 2021 at 11:46 PM Isaac Johnson <[email protected]> wrote:
>
> > Thanks all for the feedback! If anyone thinks of more, by all means send
> > over.
> >
> > > 1. One of the reasons why any suggestion that we make edit summaries
> > > compulsory is that as long as they are optional, blank edit summaries
> > > are a great way to identify vandals.
> >
> > This is a pretty interesting point. For further context, I'm asking
> > because I'm mentoring a researcher who will be looking into edit summary
> > usage and I wanted to make sure we weren't asking questions that had
> > already been answered elsewhere. The research is still in the formative
> > stages of figuring out what additional research might be useful and just
> > having a better understanding of the distribution of edit types. When I
> > think of tools / interventions based on what little I know, however,
> > it's mainly along the lines of what sorts of edit tags (or similar
> > filters) could be auto-generated to further contextualize edit
> > summaries. Helping editors quickly match their edit to templated/canned
> > messages is an idea that gets floated around too but could be
> > counterproductive for the vandalism case as you point out.
> >
> > > There is a long-standing tool to search them at
> > > https://sigma.toolforge.org/summary.py?name=Stuartyeates&search=re-review&max=500&server=enwiki&ns=Wikipedia
> > > In case you're looking for code to reuse.
> >
> > Thanks! Glad to see this tool exists!
> >
> > For completeness, it was also pointed out to me that Wattenberg, Viégas,
> > and Hollenbach's 2007 paper "Visualizing Activity on Wikipedia with
> > Chromograms" makes heavy use of edit summaries and provides some insight
> > into their usage:
> > https://link.springer.com/content/pdf/10.1007/978-3-540-74800-7_23.pdf
> >
> > Best,
> > Isaac
> >
> > On Tue, Aug 3, 2021 at 3:48 PM Stuart A. Yeates <[email protected]> wrote:
> >
> > > There is a long-standing tool to search them at
> > >
> > > https://sigma.toolforge.org/summary.py?name=Stuartyeates&search=re-review&max=500&server=enwiki&ns=Wikipedia
> > >
> > > In case you're looking for code to reuse.
> > >
> > > cheers
> > > stuart
> > > --
> > > ...let us be heard from red core to black sky
> > >
> > > On Wed, 4 Aug 2021 at 05:38, WereSpielChequers
> > > <[email protected]> wrote:
> > > >
> > > > Dear Isaac,
> > > >
> > > > I'm not aware of any research on this. But there are a couple of
> > > > common assumptions that you could check as part of any research.
> > > >
> > > > 1. One of the reasons why any suggestion that we make edit
> > > >    summaries compulsory is that as long as they are optional, blank
> > > >    edit summaries are a great way to identify vandals.
> > > > 2. There is also a certain amount of "sneaky vandalism" denoted by
> > > >    edits that get reverted or reverted and the perpetrators get
> > > >    warned for vandalism or blocked as a "vandalism only account"
> > > > 3. Though we admins have the technology to blank people's edit
> > > >    summaries it is very rarely used
> > > >
> > > > Regards
> > > > Jonathan
> > > >
> > > > On Tue, 3 Aug 2021 at 16:20, Isaac Johnson <[email protected]> wrote:
> > > >
> > > > > Does anyone know of any research or statistics around edit summary
> > > > > <https://en.wikipedia.org/wiki/Help:Edit_summary> usage on
> > > > > Wikipedia? All I could find in a quick scan was some statistics
> > > > > from 2010 (
> > > > > https://meta.wikimedia.org/wiki/Usage_of_edit_summary_on_Wikipedia
> > > > > ). I'm curious if anyone has more updated statistics, or, even
> > > > > better: a more thorough analysis of how edit summaries are used by
> > > > > editors -- i.e. how complete they are, to what degree they
> > > > > represent the "what" vs. the "why", how often they are misleading,
> > > > > etc.
> > > > >
> > > > > Best,
> > > > > Isaac
> > > > >
> > > > > --
> > > > > Isaac Johnson (he/him/his) -- Research Scientist -- Wikimedia
> > > > > Foundation
> > > > > _______________________________________________
> > > > > Wiki-research-l mailing list -- [email protected]
> > > > > To unsubscribe send an email to [email protected]

--
Isaac Johnson (he/him/his) -- Research Scientist -- Wikimedia Foundation
