Summary: Include size of the dump file in each dump file
Product: XML Snapshots
Since dump files can be so huge, it is common to process them while they are
still compressed. To report progress while processing a compressed dump we need
to know its total uncompressed size, which we can't know without decompressing
the whole file first.
It would be trivial to add a metadata field to, say, the root element, stating
the uncompressed size of the dump file.
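To show how a consumer might use such a field, here is a rough sketch of a
progress-reporting reader. It rests on several assumptions that are not part of
any existing dump format: the field is an attribute named size on the root
element, its value is written in hexadecimal digits, the root element's opening
tag sits on the first line, and the dump is bzip2-compressed. The name
iter_with_progress is made up for the example.

```python
import bz2
import re

# Hypothetical attribute: size="<hex digits>" on the root element.
SIZE_ATTR = re.compile(rb'size="([0-9a-fA-F]+)"')


def iter_with_progress(path):
    """Yield (line, fraction_done) pairs from a bz2-compressed dump.

    fraction_done is None when the first line carries no size attribute.
    """
    with bz2.open(path, "rb") as f:
        first = f.readline()
        match = SIZE_ATTR.search(first)
        total = int(match.group(1), 16) if match else None
        done = len(first)  # uncompressed bytes consumed so far
        yield first, (done / total if total else None)
        for line in f:
            done += len(line)
            yield line, (done / total if total else None)
```

A field that still holds all zeros (an unfinished dump) is treated the same as
a missing one, so the reader falls back to reporting no progress rather than
dividing by zero.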
The easiest way would be to include it as a fixed-length string, say 16
hexadecimal characters, which would allow for a 64-bit value. When initially
generating the dump this field would be set to "0000000000000000". After
completion of dump generation we know how long the dump is and can go back and
fill in this field without altering the length of the dump.
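The placeholder-then-backpatch idea can be sketched roughly as follows. This is
only an illustration, not how the dumps are actually generated: the attribute
name size and the function write_dump_with_size are hypothetical, and the
output is assumed to be a plain, seekable, uncompressed file.

```python
PLACEHOLDER = b"0" * 16  # 16 hex digits, enough for a 64-bit size


def write_dump_with_size(path, pages):
    """Write a dump, then backpatch its total size into the root element.

    pages is an iterable of already-serialized byte strings.
    Returns the final file length in bytes.
    """
    with open(path, "wb") as f:
        f.write(b'<mediawiki size="')  # "size" is a hypothetical attribute
        size_offset = f.tell()         # where the 16-character field starts
        f.write(PLACEHOLDER)
        f.write(b'">\n')
        for page in pages:
            f.write(page)
        f.write(b"</mediawiki>\n")
        total = f.tell()               # final uncompressed length in bytes
        f.seek(size_offset)
        f.write(b"%016x" % total)      # same width, so nothing shifts
    return total
```

Because the filled-in value is exactly as wide as the placeholder, the size
recorded in the field is also the size of the file that contains it.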
Of course if we generate dump files directly to their compressed form then this
may not be possible.
Depending on how we generate the dumps we might know how many lines they have,
which would also be very useful for those of us processing them line by line.