On 19/09/13 11:35, Petr Bena wrote:
<snip>
> Huggle 3 comes with vandalism prediction: it precaches the diffs,
> including their contents, even before they are enqueued. Each edit has
> a so-called "score", a numerical value; the higher it is, the more
> likely the edit is vandalism.
> 
> If you want to help us improve this feature, we need a "score words"
> list defined for every wiki where Huggle is going to be used, for
> example the English Wikipedia.
> 
> Each list has the following syntax:
> 
> (see 
> https://en.wikipedia.org/w/index.php?title=Wikipedia:Huggle/Config&diff=573615259&oldid=573615075)
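
To make the "score words" idea above concrete, here is a rough sketch of
how such a list could feed into an edit score. It is only an
illustration: the word/weight pairs, the names SCORE_WORDS and
score_edit, and the simple summing rule are all assumptions made for the
example, not Huggle's actual list, syntax or scoring code (the real
syntax is the one in the linked diff).

```python
import re

# Hypothetical word -> weight table, made up for this example.
# It is NOT Huggle's real list, nor its on-wiki config syntax.
SCORE_WORDS = {"stupid": 10, "idiot": 15, "lol": 5}


def score_edit(added_text, score_words=SCORE_WORDS):
    """Sum the weights of every listed word found in the added text.

    A higher total means the edit looks more like vandalism.
    Matching is case-insensitive and counts each occurrence.
    """
    words = re.findall(r"\w+", added_text.lower())
    return sum(score_words.get(word, 0) for word in words)


if __name__ == "__main__":
    print(score_edit("You are a STUPID idiot lol"))
    print(score_edit("The capital of France is Paris."))
```

A per-wiki list would then just be a different table passed as
score_words, which is why each wiki needs its own list.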

The good thing about reinventing the wheel is that you can reuse
existing material :-]

ClueBot NG has such a list: http://review.cluebot.cluenet.org and it is
quite an active one:
 http://en.wikipedia.org/wiki/Special:Contributions/ClueBot_NG


It uses a variety of algorithms to determine the score of an edit:
 http://en.wikipedia.org/wiki/User:ClueBot_NG#Vandalism_Detection_Algorithm


Maybe get in touch with them and reuse their engine?


-- 
Antoine "hashar" Musso


_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
