On 1/23/09 2:36 AM, Andre Engels wrote:
> Two questions:
> 1. Why is this User Agent getting this response? If I remember
> correctly, this was installed in the early days of the pywikipediabot,
> when Brion wanted to block it because it had a programming error
> causing it to fetch each page twice (sometimes even more?). If that is
> the actual reason, I see no reason why it should still be active years
> afterward...
This has nothing to do with pywikipediabot.
We ran into poorly written bots and site scrapers far too often; they
hammered the servers and caused problems. Blocking the default UAs of
common libraries cut these incidents down dramatically, and it
encourages thoughtful bot writers to put specific information into
their user-agent string, which makes it easier to track them down if
they are problematic.
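
For bot writers, a minimal sketch of what "specific information" could
look like, using Python's standard urllib. The bot name, version, URL
and contact address here are hypothetical placeholders, and the exact
format is an assumption rather than a required convention:

```python
import urllib.request

# Hypothetical bot name, version, and contact details; substitute your own.
USER_AGENT = (
    "ExampleWikiBot/0.1 "
    "(https://example.org/examplewikibot; bot-owner@example.org)"
)


def build_request(url, user_agent=USER_AGENT):
    """Return a Request that carries a descriptive User-Agent header
    instead of the library default."""
    return urllib.request.Request(url, headers={"User-Agent": user_agent})


if __name__ == "__main__":
    req = build_request("http://en.wikipedia.org/wiki/Foo")
    # urllib stores header names capitalized as "User-agent".
    print(req.get_header("User-agent"))
```

Passing the resulting object to urllib.request.urlopen() would send the
request with that header; the sketch stops short of doing so.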
> 2. If this User Agent is really to be blocked, why do we still provide
> the content of the page that is forbidden?
We don't; you get a big fat Wikimedia-customized error page with a
generic multilingual message, and this bit somewhere in the middle:
<!-- Technical details of the error; shows all the time, with any
language -->
<div class="TechnicalStuff">
<bdo dir="ltr">
Request: GET http://en.wikipedia.org/wiki/Foo, from 69.17.48.227
via sq24.wikimedia.org (squid/2.6.STABLE21) to ()<br/>
Error: ERR_ACCESS_DENIED, errno [No Error] at Fri, 23 Jan 2009
17:59:46 GMT
</bdo>
<div id="AdditionalTechnicalStuff"></div>
</div>
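
A bot could check for this block to tell a block page apart from real
content. A rough sketch in Python; the function name and the regular
expression are assumptions based only on the sample above, and other
Squid error pages may be laid out differently:

```python
import re

# Pattern derived only from the sample error page quoted above (an assumption).
_ERROR_RE = re.compile(r"Error:\s*(ERR_[A-Z_]+)")


def squid_error_code(body):
    """Return the Squid error code found in an error page body, or None."""
    match = _ERROR_RE.search(body)
    return match.group(1) if match else None


# The technical-details block from the error page, as quoted in this message.
SAMPLE = """<div class="TechnicalStuff">
<bdo dir="ltr">
Request: GET http://en.wikipedia.org/wiki/Foo, from 69.17.48.227
via sq24.wikimedia.org (squid/2.6.STABLE21) to ()<br/>
Error: ERR_ACCESS_DENIED, errno [No Error] at Fri, 23 Jan 2009
17:59:46 GMT
</bdo>
<div id="AdditionalTechnicalStuff"></div>
</div>"""

if __name__ == "__main__":
    print(squid_error_code(SAMPLE))
```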
-- brion