David Gerard wrote:
> 2009/4/10 Jameson Scanlon <[email protected]>:
> 
>> Does anyone on the wikitech mailing list happen to know whether it
>> would be possible for some of the larger wikipedia database downloads
>> (which are, say, 16GB or so in size) to be split into parts so that
>> they can be downloaded.  For whatever reason, whenever I have
>> attempted to download the ~14GB files (say, from
>> http://static.wikipedia.org/downloads/2008-06/en/ ), I have found that
>> only 2GB (presumably, the first 2GB) of what I have sought to download
>> has actually been downloaded.  Is there anyway around this?  Could
>> anyone possibly suggest what possible reasons there might be for this
>> difficulty in downloading the material?
> 
> 
> Downloading to a filesystem that only does maximum 2GB files?
> 

Also, several HTTP clients don't handle files over 2GB: the byte count in the
Content-Length header overflows a signed 32-bit integer (2GB is the 2^31-byte
limit). wget tends to die with a segmentation fault on such files. I found that
curl works.

But of course, the file system also has to support very large files, as Gerard 
said.
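
A possible work-around on the client side, if the server honours HTTP Range
requests, is to pull the file down in pieces that stay under the 2GB mark.
Rough sketch in Python - the file name, the output name and the 1GB piece size
are placeholders, and I haven't checked whether static.wikipedia.org actually
serves ranges:

import os
import urllib.error
import urllib.request

# Placeholder: the actual dump file name under
# http://static.wikipedia.org/downloads/2008-06/en/ has to be filled in.
URL = "http://static.wikipedia.org/downloads/2008-06/en/FILE_NAME_HERE"
OUT = "dump.part"                 # local file the pieces are appended to
PIECE = 1024 ** 3                 # 1 GB per request, well below the 2^31 byte limit

def fetch_in_pieces(url, out_path, piece_size=PIECE):
    # Resume from whatever has already been written locally.
    offset = os.path.getsize(out_path) if os.path.exists(out_path) else 0
    with open(out_path, "ab") as out:
        while True:
            req = urllib.request.Request(url)
            req.add_header("Range", "bytes=%d-%d" % (offset, offset + piece_size - 1))
            try:
                resp = urllib.request.urlopen(req)
            except urllib.error.HTTPError as err:
                if err.code == 416:   # requested range starts past end of file
                    break
                raise
            if resp.status != 206:    # server ignored the Range header
                raise RuntimeError("server does not support Range requests")
            got = 0
            while True:
                block = resp.read(64 * 1024)
                if not block:
                    break
                out.write(block)
                got += len(block)
            resp.close()
            offset += got
            if got < piece_size:      # short final piece: we're done
                break

if __name__ == "__main__":
    fetch_in_pieces(URL, OUT)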

Finally: yes, it would be nice to have such dumps available in pieces of perhaps
1GB each.
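
Cutting the files into fixed-size pieces would be trivial on the dump side, and
recipients could just concatenate the parts again. A rough sketch of what that
could look like - the file name, the ".partNNN" suffix and the 1GB size are made
up for illustration:

import hashlib

PIECE = 1024 ** 3   # 1 GB per piece

def split_dump(path, piece_size=PIECE):
    # Writes path.part000, path.part001, ... and prints an MD5 per piece
    # so downloaders can verify each part before reassembling them
    # (e.g. by simple concatenation).
    with open(path, "rb") as src:
        index = 0
        while True:
            chunk = src.read(piece_size)
            if not chunk:
                break
            part_name = "%s.part%03d" % (path, index)
            with open(part_name, "wb") as part:
                part.write(chunk)
            print(part_name, hashlib.md5(chunk).hexdigest())
            index += 1

if __name__ == "__main__":
    split_dump("wikipedia-en-html.tar.7z")   # placeholder file name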

-- daniel
