Combined replies to various posts below.

Steve Summit wrote:
> A few of us -- though I
> fear an inconsequential minority -- are concerned that this is a
> destabilizing change, being made in a hurry, by a top-10 website, 
> with consequences that aren't easy to predict and (apparently) 
> haven't even been thought about.

Not entirely an inconsequential minority. Google complained by email
that it broke a Google Translate feature, and they got an IP-based
exemption while they develop and deploy a fix.

In another post:
> Domas wrote:
>> Hi Steve,
>>> But why?
>>
>> Because we need to identify malicious behavior.
> 
> You're trying to detect / guard against malicious behavior using
> *User-Agent*??  Good grief.  Have fun with the whack-a-mole game, then.

Well yeah. We've had malicious traffic in the past that hasn't been
easily filterable by request headers. The response was to create a
list of the IP addresses causing the most traffic and to block them at
Squid. Squid is reasonably well-optimised for this: it stores blocked
IPs and ranges in a tree, giving you O(log N) lookup time in the
number of blocked IPs.
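
To illustrate the kind of lookup described above, here is a minimal Python sketch. It is not Squid's code: Squid uses a tree, whereas this uses a sorted list with binary search, which gives the same O(log N) lookup. The class name BlockList, the IPv4-only restriction and the example addresses (documentation ranges) are all assumptions made for the illustration.

```python
import bisect
import ipaddress


class BlockList:
    """Hypothetical block list of IPv4 addresses and CIDR ranges.

    Not Squid's implementation: a sorted array searched with bisect
    stands in for the tree, with the same O(log N) lookup cost.
    """

    def __init__(self, cidrs):
        # collapse_addresses merges overlapping or adjacent networks and
        # yields them in sorted order, so the ranges below are disjoint.
        nets = list(
            ipaddress.collapse_addresses(ipaddress.ip_network(c) for c in cidrs)
        )
        self._starts = [int(n.network_address) for n in nets]
        self._ends = [int(n.broadcast_address) for n in nets]

    def is_blocked(self, ip):
        addr = int(ipaddress.ip_address(ip))
        # Find the last range whose start is <= addr, in O(log N).
        i = bisect.bisect_right(self._starts, addr) - 1
        return i >= 0 and addr <= self._ends[i]


blocked = BlockList(["192.0.2.0/24", "198.51.100.7", "203.0.113.0/25"])
print(blocked.is_blocked("192.0.2.200"))
print(blocked.is_blocked("198.51.100.8"))
```

Because the ranges are disjoint after collapsing, checking the single candidate range found by the binary search is enough.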

Blocking by IP again would have been more work, and I appreciate that
the sysadmin team is small and needs to allocate their time carefully.
It's not my job to tell them how to do that and I wasn't offering to
help.

But note that the action taken wasn't to block all list=search API
queries that have a blank user agent header. The overly broad response
should give you a hint that there was another motive at work.
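
The difference between the two responses can be sketched in Python as follows. This is only an illustration, not the rule that was actually deployed: the function names, the plain dict of headers keyed exactly as "User-Agent", and the check for a path ending in /api.php are assumptions.

```python
from urllib.parse import parse_qs, urlsplit


def has_blank_user_agent(headers):
    # Assumes headers is a plain dict keyed exactly as "User-Agent".
    return not headers.get("User-Agent", "").strip()


def broad_rule(url, headers):
    """Reject every request with a missing or blank User-Agent."""
    return has_blank_user_agent(headers)


def narrow_rule(url, headers):
    """Reject only blank-User-Agent requests that are list=search API queries."""
    if not has_blank_user_agent(headers):
        return False
    parts = urlsplit(url)
    params = parse_qs(parts.query)
    # The API accepts several list modules separated by "|".
    lists = params.get("list", [""])[0].split("|")
    return parts.path.endswith("/api.php") and "search" in lists


search = "/w/api.php?action=query&list=search&srsearch=foo"
page = "/w/index.php?title=Main_Page"
print(broad_rule(page, {}), narrow_rule(page, {}))
print(broad_rule(search, {}), narrow_rule(search, {}))
```

The narrow rule leaves ordinary page views without a User-Agent alone, while the broad rule rejects them too.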

I think they want to make their work easier in the future. Although it
doesn't help much with malicious traffic, requiring a User-Agent
header does help to distinguish different sources of non-malicious but
excessively expensive traffic.

In another post:
> When the new code blocks requests with missing User Agent strings
> (which is, oddly, not all of the time), it is with a 403
> Forbidden response and the very simple message
> 
>        Please provide a User-Agent header
> 
> (No <html> tags, no nothing.)

Be glad it doesn't just say "sigh".

if( $wgDBname == 'kuwiki'
    && preg_match( '/\[\[Image:Flag_of_Turkey.svg\]\]/',
       @$_REQUEST['wpTextbox1'] ) )
{
    die("Sigh.\n");
}

Seriously.

http://ku.wikipedia.org/w/index.php?wpTextbox1=[[Image:Flag_of_Turkey.svg]]

Domas Mituzas wrote:
> Actually we had User-Agent header requirement for ages, it just
> failed to do what it had to do for a while. Consider this to be a
> bugfix.

For the record, I didn't like the idea the first time around either.

-- Tim Starling


_______________________________________________
Wikitech-l mailing list
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
