It seems that the pagecounts-ez sets disappeared from dumps.wikimedia.org starting on this date. Is that a coincidence, or is https://phabricator.wikimedia.org/T189283 perhaps the cause?

DJ

On Thu, Mar 29, 2018 at 2:42 PM, Ariel Glenn WMF <ar...@wikimedia.org> wrote:

> Here it comes:
>
> For the April 1st run and all following runs, the Wikidata dumps of
> pages-meta-current.bz2 will be produced only as separate downloadable
> files, no recombined single file will be produced.
>
> No other dump jobs will be impacted.
>
> A reminder that each of the single downloadable pieces has the siteinfo
> header and the mediawiki footer so they may all be processed separately by
> whatever tools you use to grab data out of the combined file. If your
> workflow supports it, they may even be processed in parallel.
>
> I am still looking into what the best approach is for the pages-articles
> dumps.
>
> Please forward wherever you deem appropriate. For further updates, don't
> forget to check the Phab ticket! https://phabricator.wikimedia.org/T179059
>
> On Mon, Mar 19, 2018 at 2:00 PM, Ariel Glenn WMF <ar...@wikimedia.org>
> wrote:
>
>> A reprieve! Code's not ready and I need to do some timing tests, so the
>> March 20th run will do the standard recombining.
>>
>> For updates, don't forget to check the Phab ticket!
>> https://phabricator.wikimedia.org/T179059
>>
>> On Mon, Mar 5, 2018 at 1:10 PM, Ariel Glenn WMF <ar...@wikimedia.org>
>> wrote:
>>
>>> Please forward wherever you think appropriate.
>>>
>>> For some time we have provided multiple numbered pages-articles bz2 files
>>> for large wikis, as well as a single file with all of the contents combined
>>> into one. This is consuming enough time for Wikidata that it is no longer
>>> sustainable. For wikis where the size of these files to recombine is "too
>>> large", we will skip this recombine step. This means that downloader
>>> scripts relying on this file will need to check its existence, and if it's
>>> not there, fall back to downloading the multiple numbered files.
>>>
>>> I expect to get this done and deployed by the March 20th dumps run. You
>>> can follow along here: https://phabricator.wikimedia.org/T179059
>>>
>>> Thanks!
>>>
>>> Ariel
>>>
>>
>>
> _______________________________________________
> Wikitech-l mailing list
> Wikitechfirstname.lastname@example.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
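
For anyone adapting a downloader script to the change announced in the quoted messages, here is a rough sketch of the "check for the combined file, otherwise fall back to the numbered files" logic. It only works on a list of file names you have already fetched from the dump directory listing; the function name `choose_downloads` and the assumed file naming (`<wiki>-<date>-<job>.xml.bz2` for the combined file, `<wiki>-<date>-<job><N>.xml[-p<start>p<end>].bz2` for the pieces) are my own illustration, not something specified in the announcement, so check them against the actual listing.

```python
import re


def choose_downloads(available, wiki, date, job):
    """Pick the combined dump file if it is listed, else the numbered pieces.

    `available` is any iterable of file names taken from a dump directory
    listing. The naming patterns below are assumptions for illustration.
    """
    available = set(available)

    # Assumed name of the single recombined file.
    combined = f"{wiki}-{date}-{job}.xml.bz2"
    if combined in available:
        return [combined]

    # Assumed name of a numbered piece, with an optional page-range suffix.
    part_re = re.compile(
        rf"^{re.escape(wiki)}-{re.escape(date)}-{re.escape(job)}"
        r"(\d+)\.xml(?:-p(\d+)p\d+)?\.bz2$"
    )

    parts = []
    for name in available:
        match = part_re.match(name)
        if match:
            # Sort numerically by piece number, then by starting page id,
            # so that piece 10 comes after piece 2.
            parts.append((int(match.group(1)), int(match.group(2) or 0), name))

    return [name for _, _, name in sorted(parts)]


if __name__ == "__main__":
    # Hypothetical listing with no combined file present.
    listing = [
        "wikidatawiki-20180401-pages-meta-current2.xml-p100p200.bz2",
        "wikidatawiki-20180401-pages-meta-current10.xml-p900p999.bz2",
        "wikidatawiki-20180401-pages-meta-current1.xml-p1p99.bz2",
    ]
    for name in choose_downloads(
        listing, "wikidatawiki", "20180401", "pages-meta-current"
    ):
        print(name)
```

Since each piece carries its own siteinfo header and mediawiki footer, the returned names can be handed to separate workers rather than being concatenated first.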