https://bugzilla.wikimedia.org/show_bug.cgi?id=61553

--- Comment #3 from LFaraone <wikipe...@luke.wf> ---
(In reply to Andre Klapper from comment #1)
> > The escaping results in incredibly-verbose robots.txt rules
> 
> ...which is not a problem per-se.
> 
> > but even our existing rules don't account for %2Fs in place of "/"s. 
> 
> So these rules could be fixed to support that too?
> 
> > We should either redirect or reject these URLs.
> 
> I cannot follow yet the advantage of this proposal. 
> 
> What is the problem you would like to solve in this bug report?

Yes, we could try and guess *every* *single* *possible* encoding of a URL and
include it in robots.txt.

So, for WT:BLP/N, that means we'll need to have these entries:

Disallow: /wiki/Wikipedia_talk:Biographies_of_living_persons/Noticeboard/
Disallow: /wiki/Wikipedia_talk:Biographies_of_living_persons/Noticeboard%2F*
Disallow: /wiki/Wikipedia_talk:Biographies_of_living_persons%2FNoticeboard/
Disallow: /wiki/Wikipedia_talk:Biographies_of_living_persons%2FNoticeboard%2F*
Disallow: /wiki/Wikipedia_talk%3ABiographies_of_living_persons/Noticeboard/
Disallow: /wiki/Wikipedia_talk%3ABiographies_of_living_persons/Noticeboard%2F*
Disallow: /wiki/Wikipedia_talk%3ABiographies_of_living_persons%2FNoticeboard/
Disallow: /wiki/Wikipedia_talk%3ABiographies_of_living_persons%2FNoticeboard%2F*


This way lies madness.

We want to express one specific thing: disallowing access to pages that are
below an article name path. To accomplish that, we need 8 (!!) rules. This
makes the list hard to manage, especially since it is edited by hand. We'll
almost certainly miss things.
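
To illustrate how quickly this grows, here is a minimal Python sketch that
enumerates the spellings of a single title when each ":" or "/" may arrive
either literally or percent-encoded. The function name robots_variants and the
ENCODINGS table are hypothetical, made up for this example; it ignores the
trailing "*" used above and lowercase escapes such as %2f, which would multiply
the list further.

```python
from itertools import product

# Assumption for illustration: only ':' and '/' are considered, and only
# their uppercase percent-encoded forms.
ENCODINGS = {":": "%3A", "/": "%2F"}


def robots_variants(title, prefix="/wiki/"):
    """Return every spelling of prefix + title with each ':' or '/'
    either left literal or percent-encoded."""
    # For each character, offer (literal, encoded) if it is encodable,
    # otherwise just the literal character.
    choices = [(ch, ENCODINGS[ch]) if ch in ENCODINGS else (ch,) for ch in title]
    return [prefix + "".join(parts) for parts in product(*choices)]


rules = robots_variants("Wikipedia_talk:Biographies_of_living_persons/Noticeboard/")
for rule in rules:
    print("Disallow: " + rule)
print(len(rules), "rules for one logical path")
```

Each additional encodable character in the title doubles the number of rules,
so three such characters already give 2**3 = 8 entries.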

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
Wikibugs-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
