If people aren't capable of following UA guidelines I doubt they're
going to follow voluntary login.

For what it's worth I absolutely support both rate-limiting and login
to get around this. In fact, I would argue that from an analytics
point of view the lack of rate limiting is probably the most
high-profile problem we have with incoming data at the moment. It's
far, far too common for random pieces of automata to set themselves up
and massively skew our datasets; identifying them in advance is
impossible (we don't always have IP data) and identifying them post-hoc
on an individual basis is massively time-consuming.
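
To make the post-hoc problem concrete, here is a rough sketch of the
kind of check involved. It assumes a hypothetical list of
(user_agent, timestamp) records; the record shape, the function name
and the 20% threshold are all illustrative, not a description of our
actual pipeline.

```python
from collections import Counter


def flag_heavy_agents(requests, share_threshold=0.2):
    """Return user agents that account for more than share_threshold of all requests.

    `requests` is an iterable of (user_agent, timestamp) pairs (hypothetical shape).
    """
    counts = Counter(ua for ua, _ts in requests)
    total = sum(counts.values())
    if total == 0:
        return []
    # Sort so the result is stable regardless of input order.
    return sorted(ua for ua, n in counts.items() if n / total > share_threshold)
```

A check like this only catches clients that share one UA string, which
is exactly why generic or copied UAs make the manual work so slow.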

Why don't we have rate limiting + login? Who would work on this? Why
/should/ we not have rate limiting?
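
As a starting point for discussion, rate limiting + login could look
something like the sketch below: a token bucket per client, with a
higher allowance for logged-in clients. The class names, the client
identifiers and the limits are all made up for illustration; this is
not an existing MediaWiki mechanism.

```python
import time


class TokenBucket:
    """Allow bursts of up to `capacity` requests, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, now=None):
        self.rate = rate
        self.capacity = capacity
        self.tokens = capacity
        self.updated = time.monotonic() if now is None else now

    def allow(self, now=None):
        now = time.monotonic() if now is None else now
        # Refill for the time elapsed since the last call, capped at capacity.
        elapsed = max(0.0, now - self.updated)
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


# Hypothetical limits: (requests per second, burst size).
ANONYMOUS_LIMIT = (1, 5)
LOGGED_IN_LIMIT = (10, 50)


class Throttle:
    """Keep one bucket per (client, logged-in status) pair."""

    def __init__(self):
        self.buckets = {}

    def allow(self, client_id, logged_in, now=None):
        key = (client_id, logged_in)
        if key not in self.buckets:
            rate, capacity = LOGGED_IN_LIMIT if logged_in else ANONYMOUS_LIMIT
            self.buckets[key] = TokenBucket(rate, capacity, now=now)
        return self.buckets[key].allow(now=now)
```

With these numbers an anonymous client gets a burst of five requests
and then one per second, while a logged-in client gets a burst of
fifty. A throttled request would be the natural place to return the
"please contact us" message.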

On 1 September 2015 at 13:37, Brion Vibber wrote:
> I'm not 100% convinced that the UA requirement is helpful, for two reasons:
>
> 1) Lots of requests will have a default like "PHP" or "Python/urllib" or
> whatever from the tool they used to build their bot. These aren't helpful
> either as they contain no information on how to get in touch.
>
> 2) It's trivial to work around the requirement for a non-blank UA by
> setting one of the above, or worse -- cut-n-pasting the UA string from a
> browser. If someone hacks this up real quick while testing, they may never
> bother putting in contact information when their bot moves from a handful
> of requests to gazillions.
>
> Auto-throttling super-high-rate API clients (by IP/IP group) and giving
> them an explicit "You really should contact us and, better yet, make it
> possible for us to contact you" message might be nice.
>
>
> We may want to seriously think about some sort of API key system... not
> necessarily as mandatory for access (we love freedom and convenience!) but
> perhaps as the way you get around being throttled for too many accesses.
> This would give us a structured way of storing their contact information,
> which might be better than unstructured names or addresses in the UA.
>
> Does it make sense to tell people "log in to your bot's account with OAuth"
> or is that too much of a pain in the ass versus "add this one parameter to
> your requests with your key"? :)
>
> -- brion
>
>
> On Tue, Sep 1, 2015 at 10:23 AM, Oliver Keyes wrote:
>
>> Awesome; thanks for the analysis, Krinkle.
>>
>> Do we want to change this behaviour? From my point of view the answer
>> is 'yes, not setting any kind of user agent is a violation of our API
>> etiquette and we should be taking steps to alert people that it is'
>> but if other people have different perspectives on this I'd love to
>> hear them.
>>
>> On 1 September 2015 at 13:18, Krinkle wrote:
>> > I've confirmed just now that whatever requirement there was, it doesn't
>> > seem to be in effect.
>> >
>> > Omitting the header entirely, sending it with an empty string, and
>> > sending it with "-": all three result in a response from the MediaWiki API.
>> >
>> > $ curl -A '' --include -v 'https://en.wikipedia.org/w/api.php?action=query&format=json'
>> >> GET /w/api.php?action=query&format=json HTTP/1.1
>> >> Host: en.wikipedia.org
>> >> Accept: */*
>> > < HTTP/1.1 200 OK
>> > ..
>> > {"batchcomplete":""}
>> >
>> >
>> > $ curl -A '-' --include -v 'https://en.wikipedia.org/w/api.php?action=query&format=json'
>> >> GET /w/api.php?action=query&format=json HTTP/1.1
>> >> User-Agent: -
>> >> Host: en.wikipedia.org
>> >> Accept: */*
>> > < HTTP/1.1 200 OK
>> > ..
>> > {"batchcomplete":""}
>> >
>> > In the past (2012?) these were definitely being blocked. (Ran into it
>> > from time to time on Toolserver.)
>> > It seems PHP's file_get_contents('http://...api..') is also working fine now,
>> > without having to ini_set a user_agent value first.
>> >
>> > -- Krinkle
>> > _______________________________________________
>> > Wikitech-l mailing list
>> > [email protected]
>> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>>
>>
>>
>> --
>> Oliver Keyes
>> Count Logula
>> Wikimedia Foundation
>>



-- 
Oliver Keyes
Count Logula
Wikimedia Foundation

