If the md5sums don't match, the files are different, which means one of them is corrupt.
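A minimal sketch of that check in Python, as an alternative to running md5sum by hand. The function names `md5_of_file` and `matches_published` and the 1 MiB chunk size are illustrative assumptions, not part of any dump tooling. The file is read in chunks, so a multi-gigabyte dump does not need to fit in memory.

```python
import hashlib


def md5_of_file(path, chunk_size=1024 * 1024):
    """Return the hex MD5 digest of the file at `path`, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # iter() with a sentinel stops when read() returns b"" at end of file.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published_md5):
    """Compare the local digest with the value listed in the md5sums file."""
    return md5_of_file(path) == published_md5.strip().lower()
```

Calling `matches_published()` with the path of the downloaded file and the digest copied from the md5sums file returns `False` when the local copy is corrupt or truncated.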
What is the size of your local file? I usually download dumps with the UNIX wget command and I don't get errors. If you are using FAT32, the file size is limited to 4 GB (some older tools and filesystems stop at 2 GB) and the file is truncated. Is that your case?

2010/12/16 Gabriel Weinberg <[email protected]>

> md5sum doesn't match. I get e74170eaaedc65e02249e1a54b1087cb (as
> opposed to 7a4805475bba1599933b3acd5150bd4d on
> http://download.wikimedia.org/enwiki/20101011/enwiki-20101011-md5sums.txt).
>
> I've downloaded it twice now and have gotten the same md5sum. Can anyone
> else confirm?
>
> On Thu, Dec 16, 2010 at 5:41 PM, emijrp <[email protected]> wrote:
>
> > Have you checked the md5sum?
> >
> > 2010/12/16 Gabriel Weinberg <[email protected]>
> >
> > > Ariel T. Glenn <ariel <at> wikimedia.org> writes:
> > >
> > > > We now have a copy of the dumps on a backup host. Although we are still
> > > > resolving hardware issues on the XML dumps server, we think it is safe
> > > > enough to serve the existing dumps read-only. DNS was updated to that
> > > > effect already; people should see the dumps within the hour.
> > > >
> > > > Ariel
> > >
> > > Hi, thank you for working so hard on this issue, but I'm still having
> > > trouble with the latest en.wikipedia dump, however. I downloaded
> > > http://download.wikimedia.org/enwiki/20101011/enwiki-20101011-pages-articles.xml.bz2
> > > and am running into trouble decompressing.
> > >
> > > In particular, bzip2 -d enwiki-20101011-pages-articles.xml.bz2 fails.
> > >
> > > And bzip2 -tvv enwiki-20101011-pages-articles.xml.bz2 reports:
> > >
> > > [2752: huff+mtf data integrity (CRC) error in data
> > >
> > > I ran bzip2recover & then bzip2 -t rec* and got the following:
> > >
> > > bzip2: rec02752enwiki-20101011-pages-articles.xml.bz2: data integrity (CRC) error in data
> > > bzip2: rec08881enwiki-20101011-pages-articles.xml.bz2: data integrity (CRC) error in data
> > > bzip2: rec26198enwiki-20101011-pages-articles.xml.bz2: data integrity (CRC) error in data

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
