Jamie Morken wrote:
>
> Hi,
>
>> What do you mean by "opening"?
>> enwiki pages-meta-history is hard due to its size, not because Ariel or
>> Tomasz being more stupid than any volunteer.
>> I trust them to do it at least as well as a volunteer would.
>> Of course, if you can perform better I'm all for giving you a shell to
>> fix it, and the scripts are there for improvements as well.
>
> I wasn't aware that the dump scripts were publicly available, where can they
> be downloaded from or are they part of mediawiki?

They are at http://svn.wikimedia.org/viewvc/mediawiki/trunk/backup/
although the files look a bit old, so perhaps there are some uncommitted
changes?
/me looks for offenders

>> What do you need exactly about the images? Which image dumps do you
>> want? Do you have enough terabytes to store them?
>> Dumps/Access has been given by request in the past to that data.
>> If it's not there it's because:
>> a) Those dumps would take a lot of space.
>
> I don't think that is a valid reason, thumbnail dumps of all the
> images from enwiki would probably be a smaller file than the current
> enwiki pages-meta-history bz2 file.

We have thumbs in lots of sizes. Which size do you want the thumbs in?
It's easy to tar all the images used on a wiki, since that's tracked in
the database, but we don't track at all which exact size each of them
was used at.

enwiki has a total of 858979 local files, which sum to 229 GB (and there's
still Commons on top of that). 2357967 unique images (37050694 uses) are
in its articles.
Assuming 20 KB per image thumb (is that a good value?), that's about 48 GB,
more than the 31.9 GB of the (really compressed) pages-meta-history.xml.7z,
but we would need to agree on the size. They would tie at about 14 KB.
Even if all thumbs were unrealistically small, 1 KB each, they would
still add up to more than 2 GB.

>> b) Nobody feels particularly interested in them.
>
> I disagree, there has been a lot of interest in having image dumps
> available for download. There was a discussion on this recently on the
> xmldatadumps list, that basically concluded that subsets of images
> (ie. enwiki thumbnails) would be useful.

I am unable to find it, although a thread like that rings a bell.

> There are wiki pages dedicated
> to this topic of how to download images, this is because there are no
> image dumps available. Is the wikimedia foundation interested to host
> image dumps again? If they are maybe we can start a discussion on how
> to make the script and what image dumps to start with.
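
For anyone who wants to redo the thumbnail size estimate above with a
different average thumb size, here is a minimal sketch of the arithmetic.
The image count and the 31.9 GB dump size are the figures from this mail;
the function names are made up for illustration, and I am assuming decimal
units (1 KB = 1000 bytes, 1 GB = 10^9 bytes), so the results only land
near the rounded numbers quoted above.

```python
# Back-of-the-envelope estimate of a thumbnail dump size.
# Assumption: decimal units (1 KB = 1000 bytes, 1 GB = 10**9 bytes).

UNIQUE_IMAGES = 2_357_967   # unique images used in enwiki articles (from this mail)
HISTORY_DUMP_GB = 31.9      # pages-meta-history.xml.7z size (from this mail)

BYTES_PER_KB = 1000
BYTES_PER_GB = 10**9


def thumbs_total_gb(avg_thumb_kb: float, n_images: int = UNIQUE_IMAGES) -> float:
    """Total size in GB of one thumb per image at the given average size."""
    return n_images * avg_thumb_kb * BYTES_PER_KB / BYTES_PER_GB


def break_even_kb(target_gb: float = HISTORY_DUMP_GB,
                  n_images: int = UNIQUE_IMAGES) -> float:
    """Average thumb size in KB at which the thumbs total equals target_gb."""
    return target_gb * BYTES_PER_GB / (n_images * BYTES_PER_KB)


if __name__ == "__main__":
    for size_kb in (1, 14, 20):
        print(f"{size_kb:>2} KB per thumb -> {thumbs_total_gb(size_kb):.1f} GB")
    print(f"tie with the history dump at {break_even_kb():.1f} KB per thumb")
```

Counting a KB as 1024 bytes instead shifts the totals by a couple of
percent, which does not change the conclusion.
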
>
> cheers,
> Jamie

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
