Ariel T. Glenn <ariel <at> wikimedia.org> writes:

> 
> We now have a copy of the dumps on a backup host.  Although we are still
> resolving hardware issues on the XML dumps server, we think it is safe
> enough to serve the existing dumps read-only.  DNS was updated to that
> effect already; people should see the dumps within the hour.  
> 
> Ariel
> 

Hi, thank you for working so hard on this issue. I'm still having trouble 
with the latest en.wikipedia dump, however. I downloaded 
http://download.wikimedia.org/enwiki/20101011/enwiki-20101011-pages-articles.xml.bz2
and am running into trouble decompressing it.

In particular, bzip2 -d enwiki-20101011-pages-articles.xml.bz2 fails.

And bzip2 -tvv enwiki-20101011-pages-articles.xml.bz2 reports:

    [2752: huff+mtf data integrity (CRC) error in data

I ran bzip2recover and then bzip2 -t rec*, and got the following:

bzip2: rec02752enwiki-20101011-pages-articles.xml.bz2: data integrity (CRC) error in data
bzip2: rec08881enwiki-20101011-pages-articles.xml.bz2: data integrity (CRC) error in data
bzip2: rec26198enwiki-20101011-pages-articles.xml.bz2: data integrity (CRC) error in data
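
In case anyone wants to reproduce the check without the bzip2 binary, here is a minimal sketch that does roughly what bzip2 -t does, using Python's standard bz2 module. The function name check_bz2 and the chunk size are my own choices for illustration, not anything from the dump tooling. It only shows how far decoding gets before the first error; it does not recover any data.

```python
import bz2


def check_bz2(path, chunk_size=1 << 20):
    """Stream-decompress `path` and report whether it decodes cleanly.

    Returns (ok, bytes_decoded, error_message). `bytes_decoded` counts the
    decompressed bytes read successfully before any failure.
    """
    decoded = 0
    try:
        # bz2.open reads the file incrementally, so the whole dump is
        # never held in memory at once.
        with bz2.open(path, "rb") as fh:
            while True:
                chunk = fh.read(chunk_size)
                if not chunk:
                    break
                decoded += len(chunk)
    except (OSError, EOFError, ValueError) as exc:
        # OSError: corrupt data (e.g. a failed CRC check).
        # EOFError: the file ends before the end-of-stream marker.
        return False, decoded, str(exc)
    return True, decoded, ""
```

Calling check_bz2("enwiki-20101011-pages-articles.xml.bz2") on my copy should come back with ok set to False if the file really is damaged, which would at least confirm that the problem is not specific to my bzip2 build.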



_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
