On Sat, Jul 18, 2009 at 6:20 AM, David Gerard <[email protected]> wrote:
> 2009/7/18 Alexandre Dulaunoy <[email protected]>:
>
>> I was wondering if it would be possible to allow web robots to access
>> http://upload.wikimedia.org/wikipedia/commons/ to gather and mirror
>> the media files. As this is pure HTTP, the mirroring could benefit from
>> the caching mechanisms of HTTP object (instead of having a large dump
>> containing all the media files, that is more difficult to cache/update).
>
>
> I see lots of files on upload.wikimedia.org on Google Image Search
> already. Is that actually forbidden by our robots.txt?
>
> It'd actually be better if Google properly indexed text pages whose
> name ends in .jpg or whatever ... but they're aware we'd like that, so
> it's up to them.

Which is why my personal wiki is patched to translate ".jpg" into
"_jpg" (and likewise for other image extensions) in all references to
image description pages.
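
A minimal sketch of that kind of rewrite, for illustration only. It is
not the actual patch, which is not shown in this thread. The function
name, the extension list, and the "File:" title used below are all
assumptions; the message itself only names ".jpg" and "etc."

```python
import re

# Hypothetical extension list: the message only mentions ".jpg" and "etc."
IMAGE_EXTENSIONS = ("jpg", "jpeg", "png", "gif", "svg")

# Match a dot followed by one of the extensions at the very end of the title.
_EXT_RE = re.compile(r"\.(" + "|".join(IMAGE_EXTENSIONS) + r")$", re.IGNORECASE)


def description_page_name(title: str) -> str:
    """Replace the dot before a trailing image extension with an underscore.

    Titles that do not end in a listed extension are returned unchanged.
    """
    return _EXT_RE.sub(lambda m: "_" + m.group(1), title)


print(description_page_name("File:Example.jpg"))  # File:Example_jpg
```

Only the final extension is rewritten, so the link to the description
page no longer looks like an image URL, while the media file itself
would keep its original name.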

-Robert Rohde

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
