The issue of mirroring Wikimedia content has been discussed with a number of scholarly institutions engaged in data-rich research, and the response was generally along the lines of "send us the specs, and we will see what we can do."
I would be interested in giving this another go if someone could provide me with those specs, preferably for the Wikimedia projects as a whole as well as broken down by individual project, language, timestamp, etc. The WikiTeam Commons archive would make a good test dataset.

Daniel

--
http://www.naturkundemuseum-berlin.de/en/institution/mitarbeiter/mietchen-daniel/
https://en.wikipedia.org/wiki/User:Daniel_Mietchen/Publications
http://okfn.org
http://wikimedia.org

On Fri, Aug 1, 2014 at 4:42 PM, Federico Leva (Nemo) <[email protected]> wrote:
> WikiTeam [1] has released an update of the chronological archive of all
> Wikimedia Commons files, up to 2013. Now at ~34 TB total.
> <https://archive.org/details/wikimediacommons>
>
> I wrote to – I think – all the mirrors in the world, but apparently
> nobody is interested in such a mass of media apart from the Internet
> Archive (and mirrorservice.org, which took Kiwix).
>
> The solution is simple: take a small bite and preserve a copy yourself.
> One slice takes only one click, from your browser to your torrent client,
> and typically 20-40 GB on your disk (biggest slice 1400 GB, smallest 216 MB).
> <https://en.wikipedia.org/wiki/User:Emijrp/Wikipedia_Archive#Image_tarballs>
>
> Nemo
>
> P.S.: Please help spread the word everywhere.
>
> [1] https://github.com/WikiTeam/wikiteam
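
For anyone who would rather script the "take a slice" step than click through the collection by hand, here is a rough sketch. It assumes the public archive.org advancedsearch API and the usual <identifier>_archive.torrent file naming; treat both as assumptions to verify, not as the WikiTeam project's own tooling.

```python
#!/usr/bin/env python3
# Sketch: list the items in the archive.org "wikimediacommons" collection
# and save the .torrent file for one slice, to be handed to any torrent client.
# Assumes the advancedsearch API and <identifier>_archive.torrent naming.
import requests

SEARCH_URL = "https://archive.org/advancedsearch.php"


def list_slices(collection="wikimediacommons", rows=1000):
    """Return item identifiers in the given archive.org collection."""
    params = {
        "q": f"collection:{collection}",
        "fl[]": "identifier",
        "rows": rows,
        "output": "json",
    }
    resp = requests.get(SEARCH_URL, params=params, timeout=60)
    resp.raise_for_status()
    return [doc["identifier"] for doc in resp.json()["response"]["docs"]]


def fetch_torrent(identifier):
    """Download the torrent file for one slice and return the local path."""
    url = f"https://archive.org/download/{identifier}/{identifier}_archive.torrent"
    resp = requests.get(url, timeout=60)
    resp.raise_for_status()
    path = f"{identifier}.torrent"
    with open(path, "wb") as f:
        f.write(resp.content)
    return path


if __name__ == "__main__":
    slices = list_slices()
    print(f"{len(slices)} items found in the collection")
    if slices:
        print("saved", fetch_torrent(slices[0]))
```

Picking a slice at random (rather than always the first) would spread seeders more evenly across the archive, which is the point of the "everyone keeps a small bite" approach.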
