Hey guys,

Thanks for explaining it to me. Can I have your IRC handles? I still have many doubts.
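To check my understanding of the flow described in the quoted thread below, here is a rough Python sketch. It is only a model, not the extension's actual PHP code. The names `hooks`, `run_hooks`, `filter_links`, `filter_merged` and `debug_log`, and the example.org patterns, are all made up for illustration; only the `EditFilterMerged` hook name comes from the thread.

```python
import re
from collections import defaultdict

# Hypothetical stand-in for MediaWiki's $wgHooks: hook name -> list of handlers.
hooks = defaultdict(list)

# Hypothetical stand-in for the private debug log mentioned in the thread.
debug_log = []

# Made-up patterns; the real lists are maintained on the wiki, not hard-coded.
WHITELIST = re.compile(r"https?://good\.example\.org")
BLACKLIST = re.compile(r"https?://[a-z0-9.-]*spam\.example")


def run_hooks(name, *args):
    """Call each handler registered for `name`; stop if one returns False."""
    for handler in hooks[name]:
        if handler(*args) is False:
            return False
    return True


def filter_links(links, whitelist, blacklist):
    """Drop whitelisted links, then return the ones matching the blacklist."""
    remaining = [link for link in links if not whitelist.search(link)]
    return [link for link in remaining if blacklist.search(link)]


def filter_merged(links):
    """Handler: reject the edit (return False) if any link is blacklisted."""
    matches = filter_links(links, WHITELIST, BLACKLIST)
    if matches:
        debug_log.append(matches)  # private log entry, as discussed below
        return False
    return True


# Rough analogue of: $wgHooks['EditFilterMerged'][] = '...::filterMerged';
hooks["EditFilterMerged"].append(filter_merged)
```

In this model, `run_hooks("EditFilterMerged", links)` plays the role of core running the hook while saving an edit, and the list of links stands in for what the parser has already extracted. Please correct me if the real flow differs.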
Is there a simpler bug related to the extension, so I can get an idea of how it works?

On Fri, Mar 8, 2013 at 5:23 AM, Chris Steipp <[email protected]> wrote:

> On Thu, Mar 7, 2013 at 1:34 PM, Platonides <[email protected]> wrote:
> > On 07/03/13 21:03, anubhav agarwal wrote:
> >> Hey Chris
> >>
> >> I was exploring SpamBlaklist Extension. I have some doubts hope you could
> >> clear them.
> >>
> >> Is there any place I can get documentation of
> >> Class SpamBlacklist in the file SpamBlacklist_body.php. ?
>
> There really isn't any documentation besides the code, but a couple
> more things you should look at. Notice that in SpamBlacklist.php,
> there is the line "$wgHooks['EditFilterMerged'][] =
> 'SpamBlacklistHooks::filterMerged';", which is the way that
> SpamBlacklist registers itself with MediaWiki core to filter edits. So
> when MediaWiki core runs the EditFilterMerged hooks (which it does in
> includes/EditPage.php, line 1287), all of the extensions that have
> registered a function for that hook are run with the passed in
> arguments, so SpamBlacklistHooks::filterMerged is run. And
> SpamBlacklistHooks::filterMerged then just sets up and calls
> SpamBlacklist::filter. So that is where you can start tracing what is
> actually in the variables, in case Platonides summary wasn't enough.
>
> >> In function filter what does the following variables represent ?
> >>
> >> $title
> > Title object (includes/Title.php) This is the page where it tried to save.
> >
> >> $text
> > Text being saved in the page/section
> >
> >> $section
> > Name of the section or ''
> >
> >> $editpage
> > EditPage object if EditFilterMerged was called, null otherwise
> >
> >> $out
> >
> > A ParserOutput class (actually, this variable name was a bad choice, it
> > looks like a OutputPage), see includes/parser/ParserOutput.php
> >
> >> I have understood the following things from the code, please correct me if
> >> I am wrong.
> >> It extracts the edited text, and parse it to find the links.
> >
> > Actually, it uses the fact that the parser will have processed the
> > links, so in most cases just obtains that information.
> >
> >> It then replaces the links which match the whitelist regex,
> > This doesn't make sense as you explain it. It builds a list of links,
> > and replaces whitelisted ones with '', ie. removes whitelisted links
> > from the list.
> >
> >> and then checks if there are some links that match the blacklist regex.
> > Yes
> >
> >> If the check is greater you return the content matched.
> >
> > Right, $check will be non-0 if the links matched the blacklist.
> >
> >> it already enters in the debuglog if it finds a match
> >
> > Yes, but that is a private log.
> > Bug 1542 talks about making that accesible in the wiki.
>
> Yep. For example, see
> * https://en.wikipedia.org/wiki/Special:Log
> * https://en.wikipedia.org/wiki/Special:AbuseLog
>
> >> I guess the bug aims at creating a sql table.
> >> I was thinking of the following fields to log.
> >> Title, Text, User, URLs, IP. I don't understand why you denied it.
> >
> > Because we don't like to publish the IPs *in the wiki*.
>
> The WMF privacy policy also discourages us from keeping IP addresses
> longer than 90 days, so if you do keep IPs, then you need a way to
> hide / purge them, and if they allow someone to see what IP address a
> particular username was using, then only users with checkuser
> permissions are allowed to see that. So it would be easier for you not
> to include it, but if it's desired, then you'll just have to build
> those protections out too.
>
> > I think the approach should be to log matches using abusefilter
> > extension if that one is loaded.
>
> The abusefilter log format has a lot of data in it specific to
> AbuseFilter, and is used to re-test abuse filters, so adding these
> hits into that log might cause some issues.
> I think either the general log, or using a separate, new log table would
> be best. Just for some numbers, in the first 7 days of this month, we've
> had an average of 27,000 hits each day. So if this goes into an existing
> log, it's going to generate a significant amount of data.
>
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

--
Cheers,
Anubhav

Anubhav Agarwal | 4th Year | Computer Science & Engineering | IIT Roorkee
