JAllemandou created this task.
JAllemandou added a project: Wikidata-Query-Service.
Restricted Application added a subscriber: Aklapper.

TASK DESCRIPTION
  As a way to get familiar with the data, please provide quantitative 
information about the dataset using Spark in a notebook (probably using Python, 
as it makes producing charts easier).
  The data can be found in:
  
    hdfs://analytics-hadoop/wmf/data/discovery/wikidata/rdf/date=20210419/wiki=wikidata
  
  There are multiple snapshot dates available, as well as multiple wikis 
(`wikidata` and `commons`). Just pick one date with `wikidata` data :)
  To list the available partitions, in Hive or spark-sql:
  
    use discovery;
    show partitions wikibase_rdf;
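
  As a possible starting point for the notebook, here is a minimal PySpark 
sketch, not a definitive implementation. It assumes that `show partitions` 
returns strings shaped like `date=20210419/wiki=wikidata` (matching the path 
above) and that the table can be read as `discovery.wikibase_rdf`. The helper 
names (`parse_partition`, `wikidata_dates`, `summarize`) are made up for 
illustration, and the `predicate` column is an assumption about the schema, so 
the sketch prints the schema first and only groups by that column if it exists.

```python
def parse_partition(spec):
    """Turn 'date=20210419/wiki=wikidata' into {'date': '20210419', 'wiki': 'wikidata'}."""
    return dict(part.split("=", 1) for part in spec.strip().split("/"))


def wikidata_dates(specs):
    """Return the sorted snapshot dates that have a `wikidata` partition."""
    parsed = (parse_partition(s) for s in specs)
    return sorted(p["date"] for p in parsed if p.get("wiki") == "wikidata")


def summarize(spark, date, wiki="wikidata", top_n=20):
    """Print the schema, the row count and (if possible) the most frequent predicates."""
    # Imported lazily so the helpers above work without a Spark installation.
    from pyspark.sql import functions as F

    df = spark.table("discovery.wikibase_rdf").where(
        (F.col("date") == date) & (F.col("wiki") == wiki)
    )
    df.printSchema()  # check the real column names before going further
    print("rows:", df.count())

    # ASSUMPTION: a `predicate` column exists; skipped otherwise.
    if "predicate" in df.columns:
        (df.groupBy("predicate")
           .count()
           .orderBy(F.desc("count"))
           .show(top_n, truncate=False))


# In a notebook with an active `spark` session, usage could look like:
#   rows = spark.sql("SHOW PARTITIONS discovery.wikibase_rdf").collect()
#   dates = wikidata_dates([r[0] for r in rows])
#   summarize(spark, dates[-1])  # most recent wikidata snapshot
```

  The per-predicate counts can then be converted with `.toPandas()` for 
charting, provided the result is limited to a small number of rows first.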

TASK DETAIL
  https://phabricator.wikimedia.org/T282139

EMAIL PREFERENCES
  https://phabricator.wikimedia.org/settings/panel/emailpreferences/

To: JAllemandou
Cc: CBogen, AKhatun_WMF, Aklapper, JAllemandou, MPhamWMF, Namenlos314, Gq86, 
Lucas_Werkmeister_WMDE, EBjune, merbst, Jonas, Xmlizer, jkroll, Wikidata-bugs, 
Jdouglas, aude, Tobias1984, Manybubbles
_______________________________________________
Wikidata-bugs mailing list -- wikidata-bugs@lists.wikimedia.org
To unsubscribe send an email to wikidata-bugs-le...@lists.wikimedia.org
