https://bugzilla.wikimedia.org/show_bug.cgi?id=62494
Bug ID: 62494
Summary: Add exception in robots.txt to allow the Internet Archiver to index action=raw
Product: Wikimedia
Version: wmf-deployment
Hardware: All
OS: All
Status: NEW
Severity: normal
Priority: Unprioritized
Component: Site requests
Assignee: [email protected]
Reporter: [email protected]
CC: [email protected],
[email protected],
[email protected], [email protected],
[email protected], [email protected]
Web browser: ---
Mobile Platform: ---
Presently, robots.txt has "Disallow: /w/". Thus, the raw wikitext of pages isn't
accessible via the Internet Archive; see e.g.
https://web.archive.org/web/20140307111730/http://en.wikipedia.org/w/index.php?title=Main_Page&action=raw
This is in contrast to sites like WikiIndex, which allow it:
https://web.archive.org/web/20131021230044/http://wikiindex.org/index.php?title=Welcome&action=raw
We should allow the Internet Archiver to index these pages so that the raw
wikitext will be available for future generations, even if the page goes away.
See
[[mw:Manual:Robots.txt#Allow_indexing_of_raw_pages_by_the_Internet_Archiver]].
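
As a rough illustration of the behaviour being requested (a sketch only, not the deployed robots.txt and not necessarily the rule given in the manual), the Python below models a per-crawler exception on top of the existing "Disallow: /w/". The user-agent token "ia_archiver", the wildcard pattern "/*&action=raw", and the helper names pattern_matches and can_fetch are all assumptions made for the example. It also assumes the crawler understands "*" wildcards and resolves conflicts by longest match, as described in RFC 9309.

```python
import re

# Hypothetical rule groups keyed by user-agent token. The "ia_archiver" token
# and the wildcard Allow pattern are assumptions for illustration only.
RULES = {
    "ia_archiver": [("allow", "/*&action=raw"), ("disallow", "/w/")],
    "*": [("disallow", "/w/")],
}


def pattern_matches(pattern: str, path: str) -> bool:
    """Prefix-match a robots.txt pattern; '*' stands for any run of characters."""
    regex = ".*".join(re.escape(piece) for piece in pattern.split("*"))
    return re.match(regex, path) is not None


def can_fetch(user_agent: str, path: str) -> bool:
    """Longest matching pattern wins, Allow wins a tie, no match means allowed."""
    group = RULES.get(user_agent, RULES["*"])
    matches = [
        (len(pattern), kind == "allow")
        for kind, pattern in group
        if pattern_matches(pattern, path)
    ]
    if not matches:
        return True
    return max(matches)[1]


if __name__ == "__main__":
    raw = "/w/index.php?title=Main_Page&action=raw"
    edit = "/w/index.php?title=Main_Page&action=edit"
    print("ia_archiver, action=raw: ", can_fetch("ia_archiver", raw))
    print("ia_archiver, action=edit:", can_fetch("ia_archiver", edit))
    print("other bot,   action=raw: ", can_fetch("SomeOtherBot", raw))
```

Under these assumptions only the archiver gains access to action=raw URLs; other /w/ URLs stay blocked for it, and other crawlers are unaffected. A pattern of this shape would miss URLs where action=raw is the first query parameter, so the real rule may need to differ.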