https://bugzilla.wikimedia.org/show_bug.cgi?id=29162

--- Comment #2 from FT2 <[email protected]> 2011-05-27 13:51:10 UTC ---
Very useful. However, we need to check what happens if some of the alternate
URLs for the canonical page are excluded by robots.txt and some aren't.

Does the search engine apply the robots.txt rule that it has for the
"canonical" page to all alternatives? Or does it get confused?

Example:

/Wikipedia%3AArbitration%2FExample is stated to have
/Wikipedia:Arbitration/Example as its canonical link. However, one of these is
excluded via robots.txt or NOINDEXed in its header and the other isn't. Knowing
the canonical link helps to identify these as "duplicates" of "the same page".
But does it guarantee that both will be treated as NOINDEXed if only one of
them is? Or do we still have to cover all variants of the URL in robots.txt?
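
To illustrate the concern, here is a minimal Python sketch. It assumes a
crawler that matches robots.txt rules by literal path prefix; whether any real
search engine behaves this way is exactly what needs checking. The Disallow
rule and the helper names are hypothetical, made up for this example.

```python
from urllib.parse import unquote

# Hypothetical robots.txt Disallow rule, for illustration only.
DISALLOW_PREFIXES = ["/Wikipedia:Arbitration/"]


def blocked_literal(path, prefixes=DISALLOW_PREFIXES):
    """Naive check: compare the raw path against each prefix as written."""
    return any(path.startswith(p) for p in prefixes)


def blocked_normalized(path, prefixes=DISALLOW_PREFIXES):
    """Decode percent-escapes on both sides before comparing."""
    decoded = unquote(path)
    return any(decoded.startswith(unquote(p)) for p in prefixes)


canonical = "/Wikipedia:Arbitration/Example"
variant = "/Wikipedia%3AArbitration%2FExample"

for path in (canonical, variant):
    print(path, blocked_literal(path), blocked_normalized(path))
```

Under the literal check only the canonical spelling is blocked, so the encoded
variant would need its own rule. Under the normalized check one rule covers
both spellings.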


