On 19/09/13 11:35, Petr Bena wrote:
<snip>
> Huggle 3 comes with vandalism-prediction as it is precaching the diffs
> even before they are enqueued including their contents. Each edit has
> so called "score" which is a numerical value that if higher, the edit
> is more likely a vandalism.
>
> If you want to help us improve this feature, it is necessary to define
> a "score words" list for every wiki where huggle is about to be used,
> for example on English wiki.
>
> Each list has following syntax:
>
> (see
> https://en.wikipedia.org/w/index.php?title=Wikipedia:Huggle/Config&diff=573615259&oldid=573615075)

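To make the "score words" idea quoted above concrete, here is a rough Python sketch. It is not Huggle's code and does not follow the Huggle config syntax: the SCORE_WORDS table, its words and weights, and the score_edit function are all hypothetical. It assumes the score is simply the sum of the weights of the listed words found in the text an edit adds.

```python
import re

# Hypothetical score-word table: word -> points added per occurrence.
# The words and weights are invented for illustration only.
SCORE_WORDS = {"stupid": 10, "sucks": 15, "lol": 5}


def score_edit(added_text, score_words=SCORE_WORDS):
    """Return the summed weight of every listed word in the added text.

    A higher result means the edit looks more like vandalism under this
    (assumed) word-list model.
    """
    # Lowercase the text and split it into simple word tokens.
    words = re.findall(r"[a-z']+", added_text.lower())
    return sum(score_words.get(word, 0) for word in words)


if __name__ == "__main__":
    print(score_edit("This article sucks lol"))
    print(score_edit("Added a citation for the 2012 census."))
```

A real list would be per-wiki, as the quoted message says, and a word list alone would likely be only one input among several.
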
The good thing about reinventing the wheel is that you can reuse
existing material :-]

ClueBot NG has such a list:
http://review.cluebot.cluenet.org

and it is quite an active one:
http://en.wikipedia.org/wiki/Special:Contributions/ClueBot_NG

It uses a variety of algorithms to determine the score of an edit:
http://en.wikipedia.org/wiki/User:ClueBot_NG#Vandalism_Detection_Algorithm

Maybe get in touch with them and reuse their engine?

-- 
Antoine "hashar" Musso
