On Sat, Jan 10, 2009 at 9:14 AM, Keisial <[email protected]> wrote:
> bzipping the pages by blocks as I did for my offline reader produces a
> file size similar to the original*
> There may be ways to get similar results without having to rebuild the
> revisions.
> Also note that in both cases you still need an intermediate app to
> provide input dumps for those tools.
>
> *112% measuring enwiki-20081008-pages-meta-current. Looking at
> ruwiki-20081228-history, both the original bz2 and my faster-access one
> are 8.2G.
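For anyone wondering what the block scheme looks like in practice, here's
a minimal sketch in Python (this is not Keisial's actual reader; the block
size, the NUL separator, and the index layout are all made up for
illustration). A reader only decompresses the one block that holds the
page it wants, instead of streaming the whole dump:

    import bz2

    def compress_in_blocks(pages, block_size=100):
        # Group pages into fixed-size blocks, bzip each block
        # separately, and record an index so a reader can later
        # decompress just the block holding a given page.
        # block_size=100 and the NUL separator are arbitrary
        # choices for this sketch.
        blocks = []   # compressed block payloads
        index = {}    # page title -> (block number, slot within block)
        buf = []
        for title, text in pages:
            index[title] = (len(blocks), len(buf))
            buf.append(text)
            if len(buf) == block_size:
                blocks.append(bz2.compress("\x00".join(buf).encode("utf-8")))
                buf = []
        if buf:
            blocks.append(bz2.compress("\x00".join(buf).encode("utf-8")))
        return blocks, index

    def fetch(blocks, index, title):
        # Random access: decompress only the block containing the page.
        block_no, slot = index[title]
        block = bz2.decompress(blocks[block_no]).decode("utf-8")
        return block.split("\x00")[slot]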
-history dumps and one-off page dumps are pretty distinct cases: the
history dumps have a lot more available redundancy. For fast access to
articles you might want to consider compressing them one-off with a
dictionary-based pre-pass such as http://xwrt.sourceforge.net/
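To make the pre-pass idea concrete: XWRT itself is a C++ word-replacing
transform with a much more elaborate codeword and container scheme, but
the core trick is just mapping frequent words to short codes before the
entropy coder sees the text. A toy sketch of that idea (it assumes the
input contains no literal \x01 bytes, and everything about the code
layout is invented here):

    import bz2
    import re
    from collections import Counter

    def xwrt_like_compress(text, dict_size=200):
        # Build a dictionary of the most frequent words and replace
        # each occurrence with a two-byte code (\x01 + index) before
        # bzipping, so bzip2 sees shorter, more regular symbols.
        words = re.findall(r"\w+", text)
        common = [w for w, _ in Counter(words).most_common(dict_size)]
        codes = {w: "\x01" + chr(i) for i, w in enumerate(common)}
        transformed = re.sub(r"\w+",
                             lambda m: codes.get(m.group(0), m.group(0)),
                             text)
        return common, bz2.compress(transformed.encode("utf-8"))

    def xwrt_like_decompress(common, blob):
        # Invert the transform: expand each two-byte code to its word.
        text = bz2.decompress(blob).decode("utf-8")
        return re.sub("\x01(.)",
                      lambda m: common[ord(m.group(1))],
                      text, flags=re.S)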
