GoranSMilovanovic added a comment.

  `Thu 26 Mar 2020 11:35:59 PM UTC`
  
  - a sample of SPARQL queries was drawn from `wmf.webrequest` by randomly 
sampling 1% of all queries sent to WDQS on each day from `2020-03-01` to 
`2020-03-20`;
  - the sample has now been cleaned: all rows with `http_status == 4**` were 
removed (client-side errors; I guess we are not interested in malformed 
queries), the data were checked for consistency, and the `uri_query` field was 
URL-decoded so that it contains only SPARQL code;
  - next steps:
    - exploratory data analysis;
    - feature engineering: describing queries from their SPARQL language 
constituents.
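
The cleaning step and a first pass at the planned feature engineering could look roughly like the Python sketch below. It is only an illustration, not the code used for this task: the `(http_status, uri_query)` row shape, the helper names, the keyword list, and the assumption that the SPARQL text sits in a `query` URL parameter are all hypothetical.

```python
import re
from collections import Counter
from urllib.parse import parse_qs

# Hypothetical keyword list; a real feature set would likely be broader.
KEYWORDS = ("SELECT", "ASK", "CONSTRUCT", "DESCRIBE", "OPTIONAL",
            "FILTER", "UNION", "SERVICE", "LIMIT")
_KEYWORD_RE = re.compile(r"\b(" + "|".join(KEYWORDS) + r")\b", re.IGNORECASE)


def is_client_error(http_status):
    """True for any 4xx status, i.e. the `http_status == 4**` rows."""
    return 400 <= int(http_status) <= 499


def extract_sparql(uri_query):
    """Return the URL-decoded SPARQL text, or None if there is none.

    Assumes the query is carried in a `query` parameter; parse_qs
    handles the percent-decoding.
    """
    params = parse_qs(uri_query.lstrip("?"))
    values = params.get("query")
    return values[0] if values else None


def clean_sample(rows):
    """Drop 4xx rows and rows without SPARQL; return decoded queries."""
    cleaned = []
    for http_status, uri_query in rows:
        if is_client_error(http_status):
            continue
        sparql = extract_sparql(uri_query)
        if sparql:
            cleaned.append(sparql)
    return cleaned


def keyword_counts(sparql):
    """Count occurrences of selected SPARQL keywords (naive regex match)."""
    return Counter(m.group(1).upper() for m in _KEYWORD_RE.finditer(sparql))


if __name__ == "__main__":
    # Made-up rows for illustration only, not real webrequest data.
    rows = [
        ("200", "?query=SELECT%20%3Fitem%20WHERE%20%7B%20%3Fitem%20"
                "wdt%3AP31%20wd%3AQ5%20%7D%20LIMIT%2010&format=json"),
        ("400", "?query=SELEC"),
        ("200", "?format=json"),
    ]
    for q in clean_sample(rows):
        print(q, dict(keyword_counts(q)))
```

The regex match is deliberately naive: it also counts keywords that appear inside string literals or comments, so a proper SPARQL parser would be the more reliable basis for describing queries by their language constituents.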

TASK DETAIL
  https://phabricator.wikimedia.org/T248308

To: GoranSMilovanovic
Cc: Addshore, Lydia_Pintscher, WMDE-leszek, Aklapper, darthmon_wmde, Nandana, 
Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper, 
Scott_WUaS, Wikidata-bugs, aude, Mbch331