Hi Tim,

If there's a problem with viewing past versions of the main page, that's perfectly okay -- it can simply be excluded from the resources that are datetime content negotiable, just as the Special: pages are.
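To make the exclusion idea concrete, here is a rough sketch in Python. It is not the extension's actual code: the function names, the "Main Page" title, and the exclusion lists are all made up for illustration, and it assumes the server only goes back in time when the client explicitly sends X-Accept-Datetime.

```python
from typing import Mapping

# Hypothetical exclusion rules, for illustration only.
EXCLUDED_PREFIXES = ("Special:",)   # namespaces that never serve old versions
EXCLUDED_TITLES = {"Main Page"}     # individual pages opted out


def is_datetime_negotiable(title: str) -> bool:
    """Return True if old versions of this page may be served at all."""
    if title in EXCLUDED_TITLES:
        return False
    return not title.startswith(EXCLUDED_PREFIXES)


def wants_old_version(title: str, headers: Mapping[str, str]) -> bool:
    """Return True only if the client asked for a past version of an eligible page."""
    # HTTP header names are case-insensitive, so compare in lower case.
    has_header = any(name.lower() == "x-accept-datetime" for name in headers)
    return has_header and is_datetime_negotiable(title)
```

With rules like these, a request for the main page would always get the current version, whether or not the header is present.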
I admit I don't completely follow the second issue. A regular robot would never send X-Accept-Datetime to jump back in time, so that's okay. A regular robot would also respect the history page policy and not crawl backwards either, as you say. A robot that did send X-Accept-Datetime would end up crawling old revision pages without ever hitting a history list, but couldn't that also be forbidden via robots.txt, if the revision pages were excluded too?

However, it seems like it will be a long time before people write past-web crawlers, and the use case for doing it at all is pretty hard to come up with. :)

Hope this addresses your concerns!

Rob

On Thu, Nov 12, 2009 at 5:15 PM, Tim Starling <[email protected]> wrote:

> Daniel Kinzler wrote:
> > Hi all
> >
> > The Memento Project <http://www.mementoweb.org/> (including the Los Alamos
> > National Laboratory (!) featuring Herbert Van de Sompel of OpenURL fame) is
> > proposing a new HTTP header, X-Accept-Datetime, to fetch old versions of a web
> > resource. They already wrote a MediaWiki extension for this
> > <http://www.mediawiki.org/wiki/Extension:Memento> - which would of course be
> > particularly interesting for use on Wikipedia.
> >
> > Do you think we could have this for Wikimedia project? I think that would be
> > very nice indeed. I recall that ways to look at last weeks main page have been
> > discussed before, and I see several issues:
> >
>
> You can't view the main page as it was in the past, because users
> routinely upload temporary images to display there, so that they can
> be protected, and then delete them once they're off the page.
>
> Also, we can't have people crawling Wikipedia while requesting old
> versions, because of the excessive disk seeking and CPU usage that
> would generate. That's why the history page has a robot policy of
> noindex, nofollow.
>
> -- Tim Starling
>
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
