On Sat, Jul 18, 2009 at 6:20 AM, David Gerard <[email protected]> wrote:
> 2009/7/18 Alexandre Dulaunoy <[email protected]>:
>
>> I was wondering if it would be possible to allow web robots to access
>> http://upload.wikimedia.org/wikipedia/commons/ to gather and mirror
>> the media files. As this is pure HTTP, the mirroring could benefit from
>> the caching mechanisms of HTTP object (instead of having a large dump
>> containing all the media files, that is more difficult to cache/update).
>
>
> I see lots of files on upload.wikimedia.org on Google Image Search
> already. Is that actually forbidden by our robots.txt?
>
> It'd actually be better if Google properly indexed text pages whose
> name ends in .jpg or whatever ... but they're aware we'd like that, so
> it's up to them.

Which is why my personal wiki is patched to translate the ".jpg" into
"_jpg", etc. for all references to image description pages.

-Robert Rohde

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
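
For illustration only, here is a minimal sketch of the kind of title rewrite
described above: a trailing ".jpg" becomes "_jpg" so that a description page
name no longer looks like a media file to a crawler. This is not the actual
patch, which the message does not show. The function name and the list of
extensions are assumptions made for the example, and Python is used purely
for illustration (MediaWiki itself is written in PHP).

```python
import re

# Hypothetical extension list: the message only names ".jpg" and "etc.",
# so the other entries are assumptions for this example.
MEDIA_EXTENSIONS = ("jpg", "jpeg", "png", "gif", "svg", "ogg")

# Match a dot followed by one of the extensions at the very end of the title.
_TRAILING_EXT = re.compile(
    r"\.(" + "|".join(MEDIA_EXTENSIONS) + r")$", re.IGNORECASE
)


def description_page_title(title: str) -> str:
    """Rewrite a trailing media extension such as ".jpg" to "_jpg".

    Titles that do not end in a listed extension are returned unchanged.
    Only the final extension is rewritten, and its case is preserved.
    """
    return _TRAILING_EXT.sub(lambda m: "_" + m.group(1), title)


if __name__ == "__main__":
    for name in ("File:Example.jpg", "File:Example.JPG", "File:Notes.txt"):
        print(name, "->", description_page_title(name))
```

A real patch would also need the reverse mapping, so that a request for the
rewritten name still resolves to the original description page. That part is
left out here because the message gives no detail about it.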
