If people aren't capable of following UA guidelines, I doubt they're going to follow voluntary login.
For what it's worth, I absolutely support both rate limiting and login to get around this. In fact, I would argue that from an analytics point of view the lack of rate limiting is probably the most high-profile problem we have with incoming data at the moment. It's far, far too common for random pieces of automata to set themselves up and massively skew our datasets; identifying them in advance is impossible (we don't always have IP data), and identifying them post hoc on an individual basis is massively time-consuming.

Why don't we have rate limiting + login? Who would work on this? Why /should/ we not have rate limiting?

On 1 September 2015 at 13:37, Brion Vibber <[email protected]> wrote:
> I'm not 100% convinced that the UA requirement is helpful, for two reasons:
>
> 1) Lots of requests will have a default UA like "PHP" or "Python/urllib" or
> whatever from the tool they used to build their bot. These aren't helpful
> either, as they contain no information on how to get in touch.
>
> 2) It's trivial to work around the requirement for a non-blank UA by
> setting one of the above, or worse -- cut-n-pasting the UA string from a
> browser. If someone hacks this up real quick while testing, they may never
> bother putting in contact information when their bot moves from a handful
> of requests to gazillions.
>
> Auto-throttling super-high-rate API clients (by IP/IP group) and giving
> them an explicit "You really should contact us and, better yet, make it
> possible for us to contact you" message might be nice.
>
>
> We may want to seriously think about some sort of API key system... not
> necessarily as mandatory for access (we love freedom and convenience!) but
> perhaps as the way you get around being throttled for too many accesses.
> This would give us a structured way of storing their contact information,
> which might be better than unstructured names or addresses in the UA.
>
> Does it make sense to tell people "log in to your bot's account with OAuth"
> or is that too much of a pain in the ass versus "add this one parameter to
> your requests with your key"? :)
>
> -- brion
>
>
> On Tue, Sep 1, 2015 at 10:23 AM, Oliver Keyes <[email protected]> wrote:
>
>> Awesome; thanks for the analysis, Krinkle.
>>
>> Do we want to change this behaviour? From my point of view the answer
>> is 'yes, not setting any kind of user agent is a violation of our API
>> etiquette and we should be taking steps to alert people that it is'
>> but if other people have different perspectives on this I'd love to
>> hear them.
>>
>> On 1 September 2015 at 13:18, Krinkle <[email protected]> wrote:
>> > I've confirmed just now that whatever requirement there was, it doesn't
>> > seem to be in effect.
>> >
>> > Omitting the header entirely, sending it with an empty string, and
>> > sending it with "-": all three result in a response from the MediaWiki API.
>> >
>> > $ curl -A '' --include -v 'https://en.wikipedia.org/w/api.php?action=query&format=json'
>> > > GET /w/api.php?action=query&format=json HTTP/1.1
>> > > Host: en.wikipedia.org
>> > > Accept: */*
>> > < HTTP/1.1 200 OK
>> > ..
>> > {"batchcomplete":""}
>> >
>> >
>> > $ curl -A '-' --include -v 'https://en.wikipedia.org/w/api.php?action=query&format=json'
>> > > GET /w/api.php?action=query&format=json HTTP/1.1
>> > > User-Agent: -
>> > > Host: en.wikipedia.org
>> > > Accept: */*
>> > < HTTP/1.1 200 OK
>> > ..
>> > {"batchcomplete":""}
>> >
>> > In the past (2012?) these were definitely being blocked. (Ran into it
>> > from time to time on Toolserver.)
>> > It seems PHP's file_get_contents('http://...api..') is also working fine now,
>> > without having to ini_set a user_agent value first.
>> >
>> > -- Krinkle
>>
>> --
>> Oliver Keyes
>> Count Logula
>> Wikimedia Foundation

--
Oliver Keyes
Count Logula
Wikimedia Foundation

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
