Addshore added a comment.

  When T120242: Consistent MediaWiki state change events | MediaWiki events as 
source of truth <https://phabricator.wikimedia.org/T120242> is ready, we could 
probably change some of the architecture and process around dumping for 
Wikidata.org.
  
  We would likely keep the existing scripts as they are for the Wikibase 
use cases, and may still want to create a script that generates TTL/RDF from a 
JSON dump.
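  
  As a rough illustration only (not an existing script), such a converter 
could start from something like the sketch below. It assumes one entity JSON 
document as input and emits only `rdfs:label` triples; the function names are 
hypothetical, and the real Wikibase RDF mapping covers far more than labels.

```python
import json

# Prefix declarations for the Turtle output.
PREFIXES = (
    "@prefix wd: <http://www.wikidata.org/entity/> .\n"
    "@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .\n"
)


def escape_literal(text: str) -> str:
    """Escape characters that are not allowed raw in a Turtle string."""
    return (
        text.replace("\\", "\\\\")
        .replace('"', '\\"')
        .replace("\n", "\\n")
        .replace("\r", "\\r")
    )


def entity_labels_to_turtle(entity: dict) -> str:
    """Hypothetical helper: turn an entity's labels into rdfs:label triples."""
    subject = "wd:" + entity["id"]
    triples = []
    for lang, label in sorted(entity.get("labels", {}).items()):
        literal = escape_literal(label["value"])
        triples.append(f'{subject} rdfs:label "{literal}"@{lang} .')
    return "\n".join(triples)


# One entity as it might appear on a single line of a JSON dump.
demo_line = (
    '{"id": "Q42", "labels": {"en": {"language": "en", '
    '"value": "Douglas Adams"}}}'
)
demo_turtle = PREFIXES + "\n" + entity_labels_to_turtle(json.loads(demo_line))
```

  Statements, sitelinks, and the rest of the data model are left out here on 
purpose.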
  
  For Wikidata.org we could move towards:
  `edit --> kafka --> streaming job --> WMF-API (get content) --> store on HDFS`
  We could then generate the JSON, RDF, and TTL dumps from HDFS much faster and 
more consistently.
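  
  The flow above could look roughly like the sketch below. Everything in it is 
an assumption for illustration: the event fields (`entity_id`, `rev_id`), the 
`fetch_content` callable standing in for the WMF API, and a plain dict standing 
in for HDFS. No real Kafka or HDFS client is used.

```python
import json
from typing import Any, Callable, Dict, Iterable

# Hypothetical event shape: {"entity_id": "Q1", "rev_id": 10}.
Event = Dict[str, Any]


def run_streaming_job(
    events: Iterable[Event],
    fetch_content: Callable[[str, int], dict],
    store: Dict[str, str],
) -> None:
    """Consume edit events, fetch each revision's content, and store it.

    `events` stands in for a Kafka consumer, `fetch_content` for the
    content API call, and `store` for HDFS (keyed by entity ID).
    """
    latest_rev: Dict[str, int] = {}
    for event in events:
        entity_id = str(event["entity_id"])
        rev_id = int(event["rev_id"])
        # Ignore events that are not newer than what is already stored.
        if rev_id <= latest_rev.get(entity_id, -1):
            continue
        content = fetch_content(entity_id, rev_id)
        store[entity_id] = json.dumps(content, sort_keys=True)
        latest_rev[entity_id] = rev_id


def fake_fetch(entity_id: str, rev_id: int) -> dict:
    # Stand-in for the content API; returns a minimal entity-like dict.
    return {"id": entity_id, "lastrevid": rev_id}


demo_store: Dict[str, str] = {}
demo_events = [
    {"entity_id": "Q1", "rev_id": 10},
    {"entity_id": "Q2", "rev_id": 11},
    {"entity_id": "Q1", "rev_id": 12},
    {"entity_id": "Q1", "rev_id": 9},  # stale, out-of-order event
]
run_streaming_job(demo_events, fake_fetch, demo_store)
```

  Keeping only the newest revision per entity is what would let a dump be 
generated from the store at any time without re-reading every page.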
  
  This will likely tie into the ongoing Wikidata / Wikibase subsetting 
discussions too, as subsetting dumps from HDFS will be much easier than with 
the existing systems.
  See T46581: Partial dumps <https://phabricator.wikimedia.org/T46581> etc.
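  
  As a hedged sketch of what subsetting could mean here, the snippet below 
filters entities by a main-snak value on one property. It assumes one entity 
JSON document per line (tolerating the `[`, `]` and trailing commas of a 
JSON-array dump); the helper names are hypothetical and nothing here reads 
from HDFS.

```python
import json
from typing import Iterable, Iterator


def has_statement(entity: dict, prop: str, target_id: str) -> bool:
    """True if the entity has a statement `prop` pointing at `target_id`."""
    for statement in entity.get("claims", {}).get(prop, []):
        snak = statement.get("mainsnak", {})
        if snak.get("snaktype") != "value":
            continue  # skip "somevalue" / "novalue" snaks
        value = snak.get("datavalue", {}).get("value")
        if isinstance(value, dict) and value.get("id") == target_id:
            return True
    return False


def subset(lines: Iterable[str], prop: str, target_id: str) -> Iterator[dict]:
    """Yield only the entities that match the given property and value."""
    for line in lines:
        line = line.strip().rstrip(",")
        if not line or line in ("[", "]"):
            continue
        entity = json.loads(line)
        if has_statement(entity, prop, target_id):
            yield entity


def _entity(entity_id: str, target: str) -> str:
    # Build a minimal demo entity with a single P31 statement.
    snak = {
        "snaktype": "value",
        "property": "P31",
        "datavalue": {"type": "wikibase-entityid", "value": {"id": target}},
    }
    return json.dumps({"id": entity_id, "claims": {"P31": [{"mainsnak": snak}]}})


demo_lines = ["[", _entity("Q42", "Q5") + ",", _entity("Q64", "Q515"), "]"]
demo_subset = [e["id"] for e in subset(demo_lines, "P31", "Q5")]
```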
  
  But most of this probably lives in separate tickets.

TASK DETAIL
  https://phabricator.wikimedia.org/T94019
