ema added a comment.

I took a look at the rate-limited requests with User-Agent 'Wikidata Query Service Updater'. They come from non-WMF IPs (WMF addresses are currently excluded from rate limiting) and their uri_path starts with /wiki/Special:EntityData/.

For instance, this is the number of WDQS Updater requests per second from a single IP this morning at 8AM (columns are minute, second, request count; first 10 rows only):

scala> val df = sqlContext.sql("select * from wmf.webrequest where webrequest_source = 'text' and year = 2017 and month = 6 and day = 16 and hour = 8 and user_agent = 'Wikidata Query Service Updater'")
scala> df.registerTempTable("t1")
scala> sqlContext.sql("select minute(ts) as m, second(ts) as s, count(1) as c from t1 where ip = 'XXX.XXX.XXX.XXX' and uri_path like '/wiki/Special:EntityData/%' group by minute(ts), second(ts) order by m, s limit 10").collect().foreach(println)
[0,0,35]
[0,3,40]
[0,4,30]
[0,5,23]
[0,7,20]
[0,8,30]
[0,9,35]
[0,10,10]
[0,12,20]
[0,13,31]

With the current rate limiting of 20/s, those result in 429 responses.
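
As a quick sanity check of the numbers above, here is a small Python sketch that counts how many of the sampled seconds go over 20 requests. It assumes a plain fixed one-second window with no burst allowance, which may not match how the actual limiter counts; the names (samples, seconds_over_limit) are made up for illustration.

```python
# Per-second request counts copied from the query output above,
# as (minute, second, count) tuples.
samples = [
    (0, 0, 35), (0, 3, 40), (0, 4, 30), (0, 5, 23), (0, 7, 20),
    (0, 8, 30), (0, 9, 35), (0, 10, 10), (0, 12, 20), (0, 13, 31),
]

LIMIT_PER_SECOND = 20  # the current limit mentioned above


def seconds_over_limit(rows, limit):
    """Return the rows whose count is strictly above the limit.

    Assumption: a fixed one-second window with no burst allowance.
    """
    return [row for row in rows if row[2] > limit]


over = seconds_over_limit(samples, LIMIT_PER_SECOND)
# Requests beyond the limit, summed over the offending seconds.
excess = sum(count - LIMIT_PER_SECOND for _, _, count in over)

print(f"{len(over)} of {len(samples)} sampled seconds exceed {LIMIT_PER_SECOND}/s")
print(f"{excess} requests above the limit in those seconds")
```

Under that simplified model, 7 of the 10 sampled seconds are over the limit, and none of them comes close to 100/s.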

So, two things:

  1. I'm gonna add /wiki/Special:EntityData/ to the list of endpoints with higher rate limiting: up to 1000 requests per 10 seconds (100/s, with 1000 burst). That should be enough given the stats above.
  2. Wikidata Query Service Updater raises an exception for any non-404 response with status code >= 300. In the specific case of 429 responses, it should definitely not crash but rather wait a bit and retry the request later. 429 "Too Many Requests" responses can contain a Retry-After header, specifying either a number of seconds to wait (a non-negative integer) or an HTTP-date after which the request can be retried. See https://tools.ietf.org/html/rfc6585#section-4 and https://tools.ietf.org/html/rfc7231#section-7.1.3. We do currently return Retry-After: 1 in our 429 responses, and that should ideally be honored.
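
To make the second point concrete, here is a rough sketch of what honoring Retry-After could look like on the client side. It is written in Python purely for illustration and is not the Updater's actual code; the function names (retry_after_seconds, fetch_with_retry), the 1-second fallback and the max_attempts cap are all assumptions.

```python
import time
import urllib.error
import urllib.request
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, now=None, default=1.0):
    """Convert a Retry-After header value into a delay in seconds.

    Handles both forms from RFC 7231 section 7.1.3: delay-seconds ("1")
    and an HTTP-date. Falls back to `default` (an assumed value) when the
    header is missing or cannot be parsed.
    """
    if value is None:
        return default
    value = value.strip()
    if value.isascii() and value.isdigit():
        return float(value)
    try:
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        return default
    if when.tzinfo is None:
        # HTTP-dates are always GMT; treat a naive result as UTC.
        when = when.replace(tzinfo=timezone.utc)
    now = now or datetime.now(timezone.utc)
    return max(0.0, (when - now).total_seconds())


def fetch_with_retry(url, opener=urllib.request.urlopen, sleep=time.sleep,
                     max_attempts=5):
    """GET `url`, waiting and retrying on 429 instead of failing right away.

    Any other HTTP error, or a 429 on the last attempt, is re-raised.
    `opener` and `sleep` are injectable only to make the sketch testable.
    """
    for attempt in range(1, max_attempts + 1):
        try:
            with opener(url) as response:
                return response.read()
        except urllib.error.HTTPError as err:
            if err.code != 429 or attempt == max_attempts:
                raise
            sleep(retry_after_seconds(err.headers.get("Retry-After")))
```

With our current Retry-After: 1, a client written like this would pause for one second after each 429 and then try again, giving up only after max_attempts tries.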

TASK DETAIL
https://phabricator.wikimedia.org/T168019

To: ema
Cc: ema, Smalyshev, Aklapper, Lisp.hippie, GoranSMilovanovic, Th3d3v1ls, Hfbn0, QZanden, EBjune, Marostegui, merbst, Avner, Zppix, debt, Gehel, Jonas, FloNight, Xmlizer, Izno, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles, faidon, Mbch331, Jay8g, fgiunchedi