[Bug 26499] Include uncompressed size and other metadata in each dump file

bugzilla-daemon Mon, 29 Aug 2011 12:40:00 -0700

https://bugzilla.wikimedia.org/show_bug.cgi?id=26499


--- Comment #14 from Ariel T. Glenn <[email protected]> 2011-08-29 19:39:55 
UTC ---
Yeah, I'm familiar with seek-bzip2, but it didn't do what I needed for my use
case.  I wanted to be able to easily locate a given XML page in a dump file
without an index. The gzip tool appears to read through the entire file (and
then keep it in memory) for random access, something we wouldn't want to do for
large files like the en wikipedia dumps. 

Another approach is to make each page a separate bzip2 stream; I haven't
decided whether that's a good thing or not (and it too would require reworking
a bunch of things that aren't designed to handle multiple streams).
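
As a rough illustration of the one-stream-per-page idea, here is a minimal
Python sketch. It is not the dump tooling. The function names and the list of
byte offsets are hypothetical, introduced only to show that a single stream
can be decompressed on its own; the comment above does not say how stream
boundaries would be located in practice.

```python
import bz2


def write_multistream(pages):
    """Compress each page as its own bzip2 stream and concatenate them.

    Returns the concatenated bytes plus the byte offset at which each
    stream starts (hypothetical bookkeeping, for illustration only).
    """
    blob = bytearray()
    offsets = []
    for page in pages:
        offsets.append(len(blob))
        blob += bz2.compress(page.encode("utf-8"))
    return bytes(blob), offsets


def read_page(blob, offset):
    """Decompress only the stream that starts at the given byte offset."""
    decomp = bz2.BZ2Decompressor()
    # The decompressor stops at the end of the first stream it sees;
    # bytes belonging to later streams are left in decomp.unused_data.
    return decomp.decompress(blob[offset:]).decode("utf-8")
```

A file built this way is still a valid concatenation of bzip2 streams, so
readers that handle multiple streams (for example bz2.decompress in Python 3.3
and later) recover the full text, while readers that stop after the first
stream would see only the first page.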

-- 
Configure bugmail: https://bugzilla.wikimedia.org/userprefs.cgi?tab=email
------- You are receiving this mail because: -------
You are on the CC list for the bug.

_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
