ema added a comment.
I took a look at rate-limited requests with User-Agent 'Wikidata Query Service Updater'. They come from non-WMF IPs (WMF addresses are currently excluded from rate limiting) and their uri_path starts with /wiki/Special:EntityData/.
For instance, this is the number of WDQS Updater requests per second from a single IP this morning at 8AM:
scala> val df = sqlContext.sql("select * from wmf.webrequest where webrequest_source = 'text' and year = 2017 and month = 6 and day = 16 and hour = 8 and user_agent = 'Wikidata Query Service Updater'")
scala> df.registerTempTable("t1")
scala> sqlContext.sql("select minute(ts) as m, second(ts) as s, count(1) as c from t1 where ip = 'XXX.XXX.XXX.XXX' and uri_path like '/wiki/Special:EntityData/%' group by minute(ts), second(ts) order by m, s limit 10").collect().foreach(println)
[0,0,35]
[0,3,40]
[0,4,30]
[0,5,23]
[0,7,20]
[0,8,30]
[0,9,35]
[0,10,10]
[0,12,20]
[0,13,31]
With the current rate limiting of 20/s, those result in 429 responses.
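As a rough check of the sample above against the limits discussed here, the sketch below counts how many of the sampled seconds exceed a given per-second rate. It treats every one-second bucket independently, which is a simplification and not how the actual rate limiter accounts for bursts; the object and method names (RateCheck, secondsOver, peak) are made up for illustration.

```scala
object RateCheck {
  // (minute, second, requests) rows copied from the query output above.
  val samples: Seq[(Int, Int, Int)] = Seq(
    (0, 0, 35), (0, 3, 40), (0, 4, 30), (0, 5, 23), (0, 7, 20),
    (0, 8, 30), (0, 9, 35), (0, 10, 10), (0, 12, 20), (0, 13, 31)
  )

  // Number of sampled seconds whose request count is strictly above the limit.
  def secondsOver(limitPerSecond: Int): Int =
    samples.count { case (_, _, c) => c > limitPerSecond }

  // Highest per-second request count in the sample.
  def peak: Int = samples.map(_._3).max
}
```

With these numbers, RateCheck.secondsOver(20) is 7 out of 10 sampled seconds, RateCheck.peak is 40, and RateCheck.secondsOver(100) is 0.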
So, two things:
- I'm gonna add /wiki/Special:EntityData/ to the list of endpoints with higher rate limiting: up to 1000 requests per 10 seconds (100/s, with 1000 burst). That should be enough given the stats above.
- Wikidata Query Service Updater raises an exception for any non-404 response with status code >= 300. In the specific case of 429 responses, it should definitely not crash but rather wait a bit and retry the request later. 429 "Too Many Requests" responses can contain a Retry-After header, specifying either a number of seconds to wait (an integer) or an HTTP-date after which the request can be retried. See https://tools.ietf.org/html/rfc6585#section-4 and https://tools.ietf.org/html/rfc7231#section-7.1.3. We do currently return Retry-After: 1 in our 429 responses, and that should ideally be honored.
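To illustrate the second point, here is a minimal sketch of how a client could turn a Retry-After value into a wait time, covering both forms allowed by RFC 7231 (delay in seconds, or HTTP-date). This is not the Updater's actual code: the object name RetryAfter, the delay method and the one-second fallback are assumptions made for the example, and only the preferred HTTP-date format (e.g. "Fri, 16 Jun 2017 08:00:05 GMT") is handled.

```scala
import java.time.{Duration, Instant, ZonedDateTime}
import java.time.format.DateTimeFormatter
import scala.util.Try

object RetryAfter {
  // Fallback when the header is missing or unparseable (an assumed value).
  val DefaultDelay: Duration = Duration.ofSeconds(1)

  // Convert a Retry-After header value into a non-negative wait time.
  def delay(header: Option[String], now: Instant = Instant.now()): Duration =
    header.map(_.trim).flatMap { value =>
      // Form 1: delay-seconds, a non-negative integer such as "1".
      val seconds = Try(value.toLong).toOption
        .filter(_ >= 0)
        .map(s => Duration.ofSeconds(s))
      // Form 2: HTTP-date; a date already in the past means "retry now".
      def date = Try(ZonedDateTime.parse(value, DateTimeFormatter.RFC_1123_DATE_TIME))
        .toOption
        .map(d => Duration.between(now, d.toInstant))
        .map(d => if (d.isNegative) Duration.ZERO else d)
      seconds.orElse(date)
    }.getOrElse(DefaultDelay)
}
```

A caller that receives a 429 would then sleep for RetryAfter.delay(...).toMillis and reissue the same request, ideally with a cap on the number of attempts, instead of treating the response as a fatal error.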
To: ema
Cc: ema, Smalyshev, Aklapper, Lisp.hippie, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, Marostegui, merbst, Avner, Zppix, debt, Gehel, Jonas, FloNight, Xmlizer, Izno, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi
_______________________________________________ Wikidata-bugs mailing list [email protected] https://lists.wikimedia.org/mailman/listinfo/wikidata-bugs
