GoranSMilovanovic added a comment.

  @Lydia_Pintscher We forgot to mention this task in our recent 1:1. In the 
meantime, I've tested a 10% sample of daily queries, and the statistics from 
the smaller, previously used 1% sample turn out to be quite representative. 
However, if tabulation (e.g. counts and average query response times per user 
agent) is really all that we need here, then we do not need to sample at all: 
we can let PySpark do the tabulation in the Analytics Cluster and cover 
everything back to some point in the past.
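
A minimal sketch of the kind of per-user-agent tabulation meant here, written in plain Python rather than PySpark so that it runs anywhere. The record layout (`user_agent`, `response_ms`) and the sample values are assumptions made up for illustration, not the actual query log schema. On the cluster the same aggregation would presumably be a `groupBy("user_agent")` followed by `agg(...)` with a count and an average.

```python
from collections import defaultdict


def tabulate_by_user_agent(rows):
    """Return {user_agent: (query_count, mean_response_ms)}.

    `rows` is an iterable of (user_agent, response_ms) pairs. Both field
    names are hypothetical and not taken from the real query logs.
    """
    counts = defaultdict(int)
    totals = defaultdict(float)
    for user_agent, response_ms in rows:
        counts[user_agent] += 1
        totals[user_agent] += response_ms
    # Mean response time is the summed time divided by the query count.
    return {ua: (counts[ua], totals[ua] / counts[ua]) for ua in counts}


# Made-up records, for illustration only.
rows = [
    ("bot-a", 120.0),
    ("bot-a", 80.0),
    ("browser-b", 300.0),
]

stats = tabulate_by_user_agent(rows)
for user_agent, (n, mean_ms) in sorted(stats.items()):
    print(user_agent, n, mean_ms)
```

Because counts and sums can be combined across partitions, this kind of tabulation scales to the full set of queries without sampling.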

TASK DETAIL
  https://phabricator.wikimedia.org/T248308

To: GoranSMilovanovic
Cc: Samantha_Alipio_WMDE, MGerlach, JAllemandou, Lucas_Werkmeister_WMDE, 
Simon_Villeneuve, dcausse, Jakob_WMDE, Gehel, Addshore, Lydia_Pintscher, 
WMDE-leszek, Aklapper, darthmon_wmde, CBogen, Akuckartz, Nandana, Namenlos314, 
Lahi, Gq86, GoranSMilovanovic, QZanden, EBjune, merbst, LawExplorer, _jensen, 
rosalieper, Scott_WUaS, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, 
Tobias1984, Manybubbles, Mbch331