https://bugzilla.wikimedia.org/show_bug.cgi?id=33406
--- Comment #7 from Mark A. Hershberger <[email protected]> 2011-12-30 04:50:34 UTC ---
(In reply to comment #4)
> I SERIOUSLY doubt that robots.txt is doing ANYTHING to help lower issues we
> have here. Vandalism works even though we have a robots.txt so naturally it's
> completely ignoring that. And I know for a fact that e-mail addresses are
> already being harvested from our bugtracker, so robots.txt isn't helping
> there.

Just to be clear, I wasn't saying that we are keeping vandalism at bay by
having a stricter robots.txt file. As pointed out in comment #1, there are
plenty of links to the tracker all over the internet that vandals could follow
if that was how they found bug trackers to play with.

In the past (perhaps less so currently?) "well behaved" spiders that respected
robots.txt have routinely wreaked havoc on sites like this one that are,
essentially, a bunch of cgi scripts that result in a process being forked for
each request.

So, last week, we dealt with some apparent vandalism when someone brought the
server to a halt by requesting a particular URL over and over.

My point was simply that if we suddenly make bugzilla visible to spiders who
respect robots.txt, they would probably send a ton of queries to the server
(e.g. several spiders from each search engine) to quickly discover the newly
available data. That sort of sudden visibility could very well look a lot like
the vandalism we saw last week.

That said, something like https://bugzilla.mozilla.org/robots.txt is a good
thing to consider.
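
As a rough illustration of the kind of middle ground discussed above, here is a
minimal sketch of how a spider that respects robots.txt would interpret a file
that exposes individual bug pages, keeps the other scripts off limits, and asks
for a delay between requests. The rules, the host name bugzilla.example.org,
and the bot name ExampleBot are all made up for the example; this is not the
content of the Mozilla file or of any real robots.txt. It uses Python's
standard urllib.robotparser module, which applies the first matching rule, so
the Allow line has to come before the Disallow line.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for illustration only; not taken from any real site.
ROBOTS_TXT = """\
User-agent: *
Crawl-delay: 10
Allow: /show_bug.cgi
Disallow: /
"""


def build_parser(text: str) -> RobotFileParser:
    """Parse robots.txt content supplied as a string (no network access)."""
    parser = RobotFileParser()
    parser.parse(text.splitlines())
    return parser


parser = build_parser(ROBOTS_TXT)

# Hypothetical host and user agent.
BASE = "https://bugzilla.example.org"
AGENT = "ExampleBot"

# Individual bug pages are allowed by the Allow rule.
can_fetch_bug = parser.can_fetch(AGENT, BASE + "/show_bug.cgi?id=33406")

# Everything else, such as search result lists, falls under "Disallow: /".
can_fetch_search = parser.can_fetch(AGENT, BASE + "/buglist.cgi?query=x")

# Seconds a polite spider is asked to wait between requests.
delay = parser.crawl_delay(AGENT)

print(can_fetch_bug, can_fetch_search, delay)
```

Note that Crawl-delay is only a request. Not every crawler honors it, so it
would soften the sudden burst of queries described above rather than prevent
it.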
