Hey Guys,

Thanks for explaining it to me. Can I have your IRC handles? I think I still
have many doubts.

Is there a simpler bug related to the extension, so I can get an idea of how
it works?

On Fri, Mar 8, 2013 at 5:23 AM, Chris Steipp <[email protected]> wrote:

> On Thu, Mar 7, 2013 at 1:34 PM, Platonides <[email protected]> wrote:
> > On 07/03/13 21:03, anubhav agarwal wrote:
> >> Hey Chris
> >>
> >> I was exploring the SpamBlacklist extension. I have some doubts that I
> >> hope you could clear up.
> >>
> >> Is there any place I can get documentation of the
> >> class SpamBlacklist in the file SpamBlacklist_body.php?
>
> There really isn't any documentation besides the code, but there are a
> couple more things you should look at. Notice that in SpamBlacklist.php,
> there is the line "$wgHooks['EditFilterMerged'][] =
> 'SpamBlacklistHooks::filterMerged';", which is the way that
> SpamBlacklist registers itself with MediaWiki core to filter edits. So
> when MediaWiki core runs the EditFilterMerged hooks (which it does in
> includes/EditPage.php, line 1287), all of the extensions that have
> registered a function for that hook are run with the passed in
> arguments, so SpamBlacklistHooks::filterMerged is run. And
> SpamBlacklistHooks::filterMerged then just sets up and calls
> SpamBlacklist::filter. So that is where you can start tracing what is
> actually in the variables, in case Platonides' summary wasn't enough.
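
To check my understanding of the hook mechanism described above, here is a minimal Python model of it. It is not the extension's PHP code: the names `hooks`, `register`, `run_hooks` and `spam_filter_merged` are hypothetical, and treating a `False` return value as "reject the edit" is an assumption made for this sketch.

```python
from collections import defaultdict

# Hypothetical stand-in for $wgHooks: hook name -> list of handlers.
hooks = defaultdict(list)


def register(hook_name, handler):
    """Rough analogue of `$wgHooks[hook_name][] = handler`."""
    hooks[hook_name].append(handler)


def run_hooks(hook_name, *args):
    """Call every handler registered for hook_name with the same arguments.

    Returns False as soon as one handler returns False (assumed here to
    mean "reject"); returns True if no handler objects.
    """
    for handler in hooks[hook_name]:
        if handler(*args) is False:
            return False
    return True


def spam_filter_merged(editpage, text):
    # Placeholder check; the real handler sets up and calls
    # SpamBlacklist::filter instead of doing a substring test.
    return "spam.example" not in text


register("EditFilterMerged", spam_filter_merged)

rejected = run_hooks("EditFilterMerged", None, "see http://spam.example/buy")
accepted = run_hooks("EditFilterMerged", None, "see http://example.org/")
```

With this model, `rejected` ends up `False` and `accepted` ends up `True`, because the single registered handler is run with whatever arguments the caller passes in.
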
>
>
> >>
> >> In the function filter, what do the following variables represent?
> >>
> >> $title
> > Title object (includes/Title.php). This is the page where it tried to
> > save.
> >
> >> $text
> > Text being saved in the page/section
> >
> >> $section
> > Name of the section or ''
> >
> >> $editpage
> > EditPage object if EditFilterMerged was called, null otherwise
> >
> >> $out
> >
> > A ParserOutput object (actually, this variable name was a bad choice; it
> > looks like an OutputPage), see includes/parser/ParserOutput.php
> >
> >
> >> I have understood the following things from the code; please correct me
> >> if I am wrong. It extracts the edited text and parses it to find the links.
> >
> > Actually, it uses the fact that the parser will have processed the
> > links, so in most cases it just obtains that information.
> >
> >
> >> It then replaces the links which match the whitelist regex,
> > This doesn't make sense as you explain it. It builds a list of links,
> > and replaces whitelisted ones with '', ie. removes whitelisted links
> > from the list.
> >
> >> and then checks if there are some links that match the blacklist regex.
> > Yes
> >
> >> If the check is greater than zero, you return the content matched.
> >
> > Right, $check will be non-0 if the links matched the blacklist.
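
The whitelist-then-blacklist flow described above can be sketched roughly as follows. This is a Python model under my own assumptions, not the code in SpamBlacklist_body.php: the function name `filter_links`, the example regexes and the example URLs are all hypothetical.

```python
import re


def filter_links(links, whitelist, blacklist):
    """Return the blacklisted links that survive the whitelist.

    links     -- list of URL strings found in the edit
    whitelist -- compiled regex; matching text is replaced with ''
    blacklist -- compiled regex; any remaining match counts as a hit
    """
    # Join the links into one string so each regex runs only once.
    joined = "\n".join(links)
    # Replacing whitelisted links with '' removes them from the list.
    joined = whitelist.sub("", joined)
    # A non-empty result plays the role of a non-0 check.
    return blacklist.findall(joined)


# Hypothetical patterns, for illustration only.
whitelist = re.compile(r"https?://pills\.wiki\.example\S*")
blacklist = re.compile(r"https?://\S*pills\S*")

links = [
    "http://pills.wiki.example/info",       # blacklisted word, but whitelisted
    "http://cheap-pills.example.com/buy",   # blacklisted, not whitelisted
    "http://example.org/",                  # neither
]
matches = filter_links(links, whitelist, blacklist)
```

Here `matches` contains only the cheap-pills URL: the first link is removed by the whitelist before the blacklist is applied, and the third never matches.
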
> >
> >> It already writes an entry to the debug log if it finds a match.
> >
> > Yes, but that is a private log.
> > Bug 1542 talks about making that accessible in the wiki.
>
> Yep. For example, see
> * https://en.wikipedia.org/wiki/Special:Log
> * https://en.wikipedia.org/wiki/Special:AbuseLog
>
> >
> >
> >> I guess the bug aims at creating an SQL table.
> >> I was thinking of the following fields to log.
> >> Title, Text, User, URLs, IP. I don't understand why you denied it.
> >
> > Because we don't like to publish the IPs *in the wiki*.
>
> The WMF privacy policy also discourages us from keeping IP addresses
> longer than 90 days, so if you do keep IPs, then you need a way to
> hide / purge them, and if they allow someone to see what IP address a
> particular username was using, then only users with checkuser
> permissions are allowed to see that. So it would be easier for you not
> to include it, but if it's desired, then you'll just have to build
> those protections out too.
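
If IPs were kept, the purge step mentioned above might look something like this. It is only a sketch under my own assumptions: the `LogEntry` fields, the `purge_old_ips` name and the in-memory list are hypothetical, and a real implementation would work on a database table and also restrict who can view the IPs.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timedelta
from typing import List, Optional

# Retention window taken from the 90 days mentioned in the thread.
RETENTION = timedelta(days=90)


@dataclass(frozen=True)
class LogEntry:
    """Hypothetical shape of one blacklist-hit log row."""
    timestamp: datetime
    user: str
    url: str
    ip: Optional[str]


def purge_old_ips(entries: List[LogEntry], now: datetime) -> List[LogEntry]:
    """Return copies of the entries with the IP blanked on rows older
    than the retention window; newer rows are returned unchanged."""
    cutoff = now - RETENTION
    return [replace(e, ip=None) if e.timestamp < cutoff else e for e in entries]


now = datetime(2013, 3, 8)
entries = [
    LogEntry(datetime(2012, 11, 1), "UserA", "http://spam.example/", "192.0.2.1"),
    LogEntry(datetime(2013, 3, 1), "UserB", "http://spam.example/", "192.0.2.7"),
]
purged = purge_old_ips(entries, now)
```

After the purge, the November entry keeps its title, user and URL but has no IP, while the entry from March 1 is untouched.
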
>
> >
> > I think the approach should be to log matches using abusefilter
> > extension if that one is loaded.
>
> The abusefilter log format has a lot of data in it specific to
> AbuseFilter, and is used to re-test abuse filters, so adding these
> hits into that log might cause some issues. I think either the general
> log, or using a separate, new log table would be best. Just for some
> numbers, in the first 7 days of this month, we've had an average of
> 27,000 hits each day. So if this goes into an existing log, it's going
> to generate a significant amount of data.
>
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>



-- 
Cheers,
Anubhav


Anubhav Agarwal | 4th Year | Computer Science & Engineering | IIT Roorkee