mkroetzsch added a comment.

@Lydia_Pintscher I understand this problem, but if you put dumps from different times all in one directory, won't it become quite big over time and hard to use? Maybe one should group dumps by how often they are created (and have date directories only below that).

For some cases, there does not seem to be any problem. For example, creating all RDF dumps from the JSON dump takes about 3-6h in total (on labs), so this is easily doable on the same day as the JSON dump generation. I am sure that we could also generate alternative JSON dumps in comparable time (maybe add an hour to the RDF generation if you do it in one batch). The slow part seems to be the DB export that leads to the first JSON dump. Once you have this, the other formats should be relatively quick to do.
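To make the grouping idea concrete, here is a minimal sketch of such a layout, with the creation frequency as the first directory level and the date directory below it. The root name, the frequency labels ("weekly"), the date format, and the file name are all hypothetical and only serve as illustration; they are not the layout actually used on the dump servers.

```python
from datetime import date
from pathlib import PurePosixPath


def dump_path(root, frequency, created, filename):
    """Build <root>/<frequency>/<YYYYMMDD>/<filename>.

    All path components are illustrative assumptions: the frequency
    level comes first, and date directories exist only below it.
    """
    return PurePosixPath(root) / frequency / created.strftime("%Y%m%d") / filename


# Hypothetical example: a weekly JSON dump created on 2015-01-05.
print(dump_path("dumps", "weekly", date(2015, 1, 5), "all.json.gz"))
```

With this layout, each frequency directory only grows by one date directory per run, so dumps made on different schedules do not pile up side by side in a single directory.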
TASK DETAIL https://phabricator.wikimedia.org/T72385
