On 08/04/25 18:08, Giuseppe Lavagetto wrote:
> I’ve updated our Robot Policy[0], which was vastly outdated, the main
> revision being from 2009.

Thanks for working on an update! It seems there was a misalignment of expectations, which is in itself a problem to fix.


> The new policy isn’t more restrictive than the older one for general
> crawling of the site or the API; on the contrary we allow higher limits
> than previously stated.

I find this hard to believe, considering this new sentence for upload.wikimedia.org: «Always keep a total concurrency of at most 2, and limit your total download speed to 25 Mbps (as measured over 10 second intervals).»

This is a ridiculously low limit. It's a speed which is easy to breach in casual browsing of Wikimedia Commons categories, let alone with any kind of media-related bots.

At the suggested speed, it would take one person over 150 years to download the Wikimedia Commons files alone.
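
For scale, the arithmetic behind estimates like this can be sketched in Python. This is only a rough sketch under stated assumptions: decimal units (1 Mbps = 10^6 bit/s), a link held at the cap around the clock, and a 365.25-day year. The function names are hypothetical, and the total size of Commons is left as an input rather than asserted here.

```python
# Back-of-the-envelope: how much data fits through a bandwidth cap?
# Assumptions (not taken from the policy text): decimal units,
# a link used at the cap 24/7, and a 365.25-day year.

SECONDS_PER_DAY = 86_400
DAYS_PER_YEAR = 365.25


def bytes_per_day(mbps: float) -> float:
    """Bytes transferred in one day at a sustained rate of `mbps` megabits/s."""
    return mbps * 1_000_000 / 8 * SECONDS_PER_DAY


def bytes_in_years(years: float, mbps: float) -> float:
    """Bytes transferred in `years` years at a sustained rate of `mbps`."""
    return bytes_per_day(mbps) * DAYS_PER_YEAR * years


def years_to_download(total_bytes: float, mbps: float) -> float:
    """Years needed to fetch `total_bytes` at a sustained rate of `mbps`."""
    return total_bytes / (bytes_per_day(mbps) * DAYS_PER_YEAR)


if __name__ == "__main__":
    cap_mbps = 25.0  # the limit quoted from the policy
    print(f"per day:  {bytes_per_day(cap_mbps) / 1e9:.0f} GB")
    print(f"per year: {bytes_in_years(1, cap_mbps) / 1e12:.1f} TB")
    # Plug in whatever total size you consider accurate (hypothetical input):
    # print(years_to_download(total_bytes=..., mbps=cap_mbps))
```

At the 25 Mbps cap this gives 270 GB per day, and a 100 Mbps link moves four times as much in the same time.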

Needless to say, I breached such a threshold all the time when I compiled the https://archive.org/details/wikimediacommons collection. I typically aimed to saturate my upload bandwidth at all times when updating it, so I must have tried to download at about 100 Mbps, and it still took me months. (I used to run those scripts from my home in Milan, downloading the files to an external HDD. I stopped updating the collection after 2016 in part because I don't have FTTH in Helsinki, and the daily downloads were far too big for any storage in Wikimedia Cloud.)

I appreciate that some exceptions for Wikimedia Cloud bots were added after the discussion at https://phabricator.wikimedia.org/T391020#10716478 , but the fact remains that this comes off as a big change.

On 09/04/25 19:10, AntiCompositeNumber wrote:
> I'll just note that both API:Etiquette and the Robot Policy have been incorporated by reference into the Terms of Use: https://foundation.wikimedia.org/wiki/Policy:Terms_of_Use/en#12._API_Terms
>
> Undiscussed changes to the Terms of Use should be avoided.

This is a good point.

Parts of the Terms of Use assume that the [[m:Right to fork]] is upheld by the availability of mirrored dumps. But the media tarballs have not been updated since 2012. In effect, the WMF is now explicitly saying that no mirrors of media are allowed, except by gracious exemption granted to individual requesters.

Best,
        Federico
_______________________________________________
Wikitech-l mailing list -- wikitech-l@lists.wikimedia.org
To unsubscribe send an email to wikitech-l-le...@lists.wikimedia.org
https://lists.wikimedia.org/postorius/lists/wikitech-l.lists.wikimedia.org/
