JAllemandou created this task. JAllemandou added a project: Wikidata-Query-Service. Restricted Application added a subscriber: Aklapper.
TASK DESCRIPTION

As a way to get familiar with the data, please provide quantitative information about the dataset using Spark in a notebook (probably in Python, as it makes producing charts easier).

The data can be found in:
hdfs://analytics-hadoop/wmf/data/discovery/wikidata/rdf/date=20210419/wiki=wikidata

Multiple snapshot dates are available, as well as multiple wikis (`wikidata` and `commons`). Just pick one date with `wikidata` data :)

In hive or spark-sql:
use discovery;
show partitions wikibase_rdf;

TASK DETAIL
https://phabricator.wikimedia.org/T282139

EMAIL PREFERENCES
https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: JAllemandou
Cc: CBogen, AKhatun_WMF, Aklapper, JAllemandou, MPhamWMF, Namenlos314, Gq86, Lucas_Werkmeister_WMDE, EBjune, merbst, Jonas, Xmlizer, jkroll, Wikidata-bugs, Jdouglas, aude, Tobias1984, Manybubbles
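For the notebook requested in the task description, a first pass might look like the sketch below. It is only a starting point under stated assumptions: the table name `discovery.wikibase_rdf` and the partition columns `date` and `wiki` are inferred from the commands and path in the description, while the grouping column `predicate`, the helper names `partition_filter` and `summarize`, and the availability of a `spark` session in the notebook are assumptions, not details confirmed by the task.

```python
def partition_filter(date: str, wiki: str) -> str:
    """Build a SQL predicate selecting a single snapshot partition.

    The partition column names (`date`, `wiki`) are inferred from the
    HDFS path layout date=.../wiki=... given in the task description.
    """
    return f"`date` = '{date}' AND `wiki` = '{wiki}'"


def summarize(spark, date="20210419", wiki="wikidata",
              group_col="predicate", top_n=20):
    """Return (row count, pandas DataFrame of the top_n most frequent values).

    `spark` is an existing SparkSession. `group_col` defaults to
    "predicate", which is an ASSUMED column name; check the printed
    schema and adjust it if the table is laid out differently.
    """
    # Imported lazily so the helper above can be used without Spark installed.
    from pyspark.sql import functions as F

    df = spark.table("discovery.wikibase_rdf").where(partition_filter(date, wiki))
    df.printSchema()  # inspect the real column names before trusting group_col

    total = df.count()
    top = (
        df.groupBy(group_col)
        .count()  # adds a column named "count"
        .orderBy(F.desc("count"))
        .limit(top_n)
        .toPandas()  # small result, convenient for plotting
    )
    return total, top
```

In a notebook, calling `total, top = summarize(spark)` and then `top.plot.barh(x="predicate", y="count")` would give a quick chart of the most frequent values, assuming pandas and matplotlib are available in the kernel.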
_______________________________________________
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org