https://bugzilla.wikimedia.org/show_bug.cgi?id=26499
--- Comment #14 from Ariel T. Glenn <[email protected]> 2011-08-29 19:39:55 UTC ---
Yeah, I'm familiar with seek-bzip2, but it didn't do what I needed for my use case. I wanted to be able to easily locate a given XML page in a dump file without an index.

The gzip tool appears to read through the entire file (and then keep it in memory) for random access, something we wouldn't want to do for large files like the English Wikipedia dumps.

Another approach is to make each page a separate bzip2 stream; I haven't decided whether that's a good thing or not (and it too would require reworking a bunch of things that aren't designed to handle multiple streams).
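For illustration only, here is a minimal Python sketch of the "each page a separate bzip2 stream" idea, using the standard-library bz2 module. It is not taken from the dump tooling. The function names (write_multistream, read_page), the sample pages, and the list of byte offsets are all assumptions made for the example. In particular, the offsets list is itself a small index, so this shows how a reader could jump to one stream, not how to find a page with no index at all.

```python
import bz2
import io


def write_multistream(pages):
    """Compress each page as its own bzip2 stream and concatenate them.

    Returns the concatenated bytes and the byte offset at which each
    stream starts (hypothetical bookkeeping for this sketch).
    """
    blob = bytearray()
    offsets = []
    for page in pages:
        offsets.append(len(blob))
        blob += bz2.compress(page.encode("utf-8"))
    return bytes(blob), offsets


def read_page(f, offset, chunk_size=4096):
    """Seek to a stream boundary and decompress only that one stream."""
    f.seek(offset)
    decomp = bz2.BZ2Decompressor()
    parts = []
    # BZ2Decompressor stops at the end of the first stream it sees;
    # bytes belonging to later streams end up in decomp.unused_data.
    while not decomp.eof:
        chunk = f.read(chunk_size)
        if not chunk:
            raise EOFError("truncated bzip2 stream")
        parts.append(decomp.decompress(chunk))
    return b"".join(parts).decode("utf-8")


if __name__ == "__main__":
    pages = [
        "<page><title>A</title></page>",
        "<page><title>B</title></page>",
        "<page><title>C</title></page>",
    ]
    blob, offsets = write_multistream(pages)
    print(read_page(io.BytesIO(blob), offsets[1]))
```

A reader that does not understand multiple streams would stop after the first page, which is the kind of rework the comment refers to. Python's own bz2.decompress does handle concatenated streams and returns all pages joined together.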
