Hi, cool, I was actually expecting someone to come up with
suggestions like this. I didn't know about that list, and now I do.
In fact, closer cooperation with ClueBot is on the TO-DO list :-) Any
good algorithm for calculating a vandalism score is appreciated. This
might be the first thing we should create hooks for, so that people
can implement their own algorithms as either C++ or Python plugins
which compute the score just as they like... (unfortunately, I haven't
managed to get the Python engine working for the Windows build yet)

On Thu, Sep 19, 2013 at 4:47 PM, Antoine Musso <[email protected]> wrote:
> Le 19/09/13 11:35, Petr Bena a écrit :
> <snip>
>> Huggle 3 comes with vandalism prediction, as it precaches the diffs,
>> including their contents, even before they are enqueued. Each edit
>> has a so-called "score", a numerical value: the higher it is, the
>> more likely the edit is vandalism.
>>
>> If you want to help us improve this feature, it is necessary to define
>> a "score words" list for every wiki where huggle is about to be used,
>> for example on English wiki.
>>
>> Each list has the following syntax:
>>
>> (see 
>> https://en.wikipedia.org/w/index.php?title=Wikipedia:Huggle/Config&diff=573615259&oldid=573615075)
>
> The good thing about reinventing the wheel is that you can reuse
> existing material :-]
>
> ClueBot NG has such a list: http://review.cluebot.cluenet.org and it's
> quite an active one:
>  http://en.wikipedia.org/wiki/Special:Contributions/ClueBot_NG
>
>
> It uses a variety of algorithms to determine the score of an edit:
>  http://en.wikipedia.org/wiki/User:ClueBot_NG#Vandalism_Detection_Algorithm
>
>
> Maybe get in touch with them and reuse their engine?
>
>
> --
> Antoine "hashar" Musso
>
>
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
