GoranSMilovanovic added a comment.
`Thu 26 Mar 2020 11:35:59 PM UTC`
- a sample of SPARQL queries was obtained from `wmf.webrequest` by randomly
sampling 1% of all queries sent to WDQS on each day from `2020-03-01` to
`2020-03-20`;
- the sample is now cleaned: all requests with `http_status == 4**` (client-side
errors; I assume we are not interested in malformed queries) were removed, the
data were checked for consistency, and `uri_query` was `URLdecoded()` so that
the field contains only SPARQL code;
- next steps:
  - exploratory data analysis;
  - feature engineering: describing queries by their SPARQL language
    constituents.
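
For illustration only, the cleaning step described above could look roughly like the Python sketch below. This is not the code actually used for the sample. It assumes each sampled row is a dict carrying the `http_status` and `uri_query` fields, that `uri_query` holds the raw query string (e.g. `?query=...&format=json`), and that the SPARQL text sits in a `query` parameter; the function name `clean_sample` is hypothetical.

```python
from urllib.parse import parse_qs


def clean_sample(rows):
    """Drop 4xx requests and return the URL-decoded SPARQL text of the rest.

    Assumption: `rows` is an iterable of dicts with the keys
    'http_status' and 'uri_query' (hypothetical in-memory layout).
    """
    cleaned = []
    for row in rows:
        # Skip client-side errors (HTTP 400-499).
        if 400 <= int(row["http_status"]) < 500:
            continue
        # Remove the leading '?' and let parse_qs handle percent-decoding.
        params = parse_qs(row["uri_query"].lstrip("?"))
        # Assumption: the SPARQL code is passed in the 'query' parameter.
        queries = params.get("query")
        if not queries:
            continue  # nothing to analyse in this request
        cleaned.append(queries[0].strip())
    return cleaned


sample = [
    {"http_status": "200",
     "uri_query": "?query=SELECT%20%3Fitem%20WHERE%20%7B%7D&format=json"},
    {"http_status": "400", "uri_query": "?query=SELEC"},
    {"http_status": "200", "uri_query": "?format=json"},
]
print(clean_sample(sample))
```

Rows without a `query` parameter are dropped here as a design choice of the sketch; whether that matches the consistency checks applied to the real sample is not stated above.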
TASK DETAIL
https://phabricator.wikimedia.org/T248308
To: GoranSMilovanovic
Cc: Addshore, Lydia_Pintscher, WMDE-leszek, Aklapper, darthmon_wmde, Nandana,
Lahi, Gq86, GoranSMilovanovic, QZanden, LawExplorer, _jensen, rosalieper,
Scott_WUaS, Wikidata-bugs, aude, Mbch331