dcausse added a comment.

  Indeed, the RDF data is available in the Hive table `discovery.wikibase_rdf`,
but it is generated by reading the TTL dumps, so it might not help for this
particular task.
  Using Hadoop would indeed allow the JSON to be processed efficiently, but it
has drawbacks, as already pointed out:
  
  - it requires maintaining the Wikibase -> RDF projection in multiple
codebases (PHP in Wikibase and Spark)
  - once created on the Hadoop cluster, it will have to be pushed back to the
labstore machine for public consumption, which might add extra delay

TASK DETAIL
  https://phabricator.wikimedia.org/T94019
