Not only that, but we'd also have to exclude reverts. If someone replaces an article with "LULZ I HAX WIKI", and I revert that, the software will see me as "adding" all the text that was previously there, but of course I didn't in any reasonable sense actually do that.
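One rough way to sketch that exclusion: treat a revision as a revert when its text is byte-for-byte identical to an earlier revision of the same page, and leave it out of the per-editor byte counts. The `Revision` class, the function names and the sample history below are hypothetical, made up for illustration; this is not how MediaWiki itself attributes text.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Revision:
    """Hypothetical minimal revision record: who saved it and the full page text."""
    user: str
    text: str


def flag_identity_reverts(revisions):
    """Return one bool per revision: True if its text exactly matches an earlier revision."""
    seen = set()
    flags = []
    for rev in revisions:
        digest = hashlib.sha1(rev.text.encode("utf-8")).hexdigest()
        flags.append(digest in seen)
        seen.add(digest)
    return flags


def bytes_added_excluding_reverts(revisions):
    """Sum positive byte deltas per user, skipping revisions flagged as reverts."""
    totals = {}
    prev_size = 0
    for rev, is_revert in zip(revisions, flag_identity_reverts(revisions)):
        size = len(rev.text.encode("utf-8"))
        delta = size - prev_size
        prev_size = size
        if not is_revert and delta > 0:
            totals[rev.user] = totals.get(rev.user, 0) + delta
    return totals


# Hypothetical history: vandalism followed by a revert to the earlier text.
history = [
    Revision("Alice", "A long, carefully written article."),
    Revision("Vandal", "LULZ I HAX WIKI"),
    Revision("Todd", "A long, carefully written article."),
]
flags = flag_identity_reverts(history)
totals = bytes_added_excluding_reverts(history)
print(flags)
print(totals)
```

This only catches exact restorations. A partial revert, or a revert combined with another change, would still be counted as added text, so it is a lower bound on the problem rather than a full fix.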
Todd

On Sat, Jun 22, 2019 at 9:38 AM WereSpielChequers <[email protected]> wrote:

> Dear Haifeng Zhang,
>
> If I were you, looking at this, I'd watch out for templates. Templates,
> particularly substituted ones, involve a lot of bytes that someone hasn't
> typed. I recently did an edit that involved me typing {{subst|Infobox
> academic}}; you might be surprised how many bytes that generated, and how
> many more key depressions that edit involved compared to my typical edit.
> Similarly, reversion can involve adding a lot of bytes, but on further
> inspection you might simply be reverting a vandal who removed four
> paragraphs of text that others had contributed.
>
> You might also want to look at an editor's edit rate per hour, and the
> time since their previous edit. If their previous edit was half an hour
> earlier, they might have been making a cup of tea, cutting the grass or
> taking a phone call, or they might have spent half an hour on that edit.
> But if they have made forty edits in that previous half hour, then you
> are pretty safe to assume that those edits on average represent less
> than a minute of work.
>
> As well as what Kerry said, there are two things you might want to take
> into consideration. Firstly, those of us with experience of breaking news
> stories quickly learn the hard way to save little and often, especially
> on a topical subject. Take for example the article on Sarah Palin in the
> hours after she was announced as John McCain's running mate. My memory
> was of multiple concurrent edit wars and a tidal wave of vandalism; I
> went back later and measured it as peaking at 25 edits per minute. I
> don't think we even log the edits lost to edit conflicts, but in practice
> anyone clicking the edit button at the top was going to get an edit
> conflict. Your only chance of getting an edit to save would have been to
> edit by section.
>
> Secondly, over time editors pick up tools, some of which make a big
> difference to edit rates. Edit summaries are a good indicator of this;
> watch for words such as Twinkle, Hotcat, Huggle and AWB. I haven't used
> Catalot on Wikipedia, but it is the reason why my edit count is higher on
> Wikimedia Commons, despite my spending rather more time on Wikipedia.
>
> Regards
>
> Jonathan
>
> On Fri, 7 Jun 2019 at 22:44, Haifeng Zhang <[email protected]> wrote:
>
> > Dear folks,
> >
> > Are there studies that have examined what might affect edit size (e.g.,
> > # of words added/deleted/modified in each revision)? I am especially
> > interested in the impact of an editor's tenure/experience.
> >
> > Thanks,
> > Haifeng Zhang
> > _______________________________________________
> > Wiki-research-l mailing list
> > [email protected]
> > https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
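
The two heuristics Jonathan describes (time since the previous edit, and tool names in edit summaries) could be sketched roughly as below. The function names, the keyword list and the sample data are assumptions made for illustration; real summaries often use abbreviations or links that this simple word match would miss.

```python
import re
from datetime import datetime, timedelta

# Tool names mentioned in the thread; matched as whole words, case-insensitively.
TOOL_PATTERN = re.compile(r"\b(?:twinkle|hotcat|huggle|awb)\b", re.IGNORECASE)


def tools_in_summary(summary):
    """Return the lower-cased tool names found in an edit summary."""
    return [match.lower() for match in TOOL_PATTERN.findall(summary)]


def gaps_since_previous_edit(timestamps):
    """Return the time gap between each edit and the one before it."""
    ordered = sorted(timestamps)
    return [later - earlier for earlier, later in zip(ordered, ordered[1:])]


def edits_in_window(timestamps, end, window=timedelta(minutes=30)):
    """Count edits saved in the window ending at `end` (inclusive of `end`)."""
    return sum(1 for t in timestamps if end - window < t <= end)


# Hypothetical editor who saves an edit every 45 seconds, forty times in a row.
start = datetime(2019, 6, 22, 9, 0, 0)
timestamps = [start + timedelta(seconds=45 * i) for i in range(40)]

recent = edits_in_window(timestamps, timestamps[-1])
gaps = gaps_since_previous_edit(timestamps)
print(recent)                  # edits in the last half hour
print(max(gaps))               # longest pause between consecutive edits
print(tools_in_summary("Reverted vandalism using Huggle"))
```

Matching whole words rather than substrings is a deliberate choice here: a plain substring test for "awb" would also fire on a summary that mentions "strawberry". A long gap still cannot distinguish a tea break from half an hour of careful work, so the gap is better read as an upper bound on time spent.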
