Diederik van Liere wrote:
> To continue the discussion on how to improve the performance, would it be
> possible to distribute the dumps as a 7z / gz / other format archive
> containing multiple smaller XML files. It's quite tricky to split a very
> large XML file in smaller valid XML files and if the dumping process is
> already parallelized then we do not have to cat the different XML files to
> one large XML file but instead we can distribute multiple smaller
> parallelized files.
>
> best,
>
> Diederik

That has already been done for enwiki.
