This is a place where volunteers can step in and make it happen without the need for Wikimedia's infrastructure. (This means I can concentrate on my already very full plate of things too.)
http://meta.wikimedia.org/wiki/Data_dump_torrents

Have at!

Ariel

On 05-06-2012, Tue, at 08:57 -0400, Derric Atzrott wrote:
> I second this idea. Large archives should always be available using
> BitTorrent. I would actually suggest posting magnet links for them, though,
> instead of .torrent files. This way you can leverage the acceptable source
> feature of magnet links.
>
> https://en.wikipedia.org/wiki/Magnet_URI_scheme#Web_links_to_the_file
>
> This way we get the best of both worlds: the constant availability of direct
> downloads, and the reduction in load that p2p filesharing provides.
>
> Thank you,
> Derric Atzrott
>
> -----Original Message-----
> From: wikitech-l-boun...@lists.wikimedia.org
> [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Oren Bochman
> Sent: 05 June 2012 08:44
> To: 'Wikimedia developers'
> Subject: Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update
>
> Any chance that these archives can be served via BitTorrent, so that even
> partial downloaders can become servers, leveraging p2p to reduce overall
> bandwidth load on the servers and improve download speeds?
>
>
> -----Original Message-----
> From: wikitech-l-boun...@lists.wikimedia.org
> [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Mike Dupont
> Sent: Saturday, June 02, 2012 1:28 AM
> To: Wikimedia developers; wikiteam-disc...@googlegroups.com
> Subject: Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update
>
> I now have cron running the archiving every 30 minutes:
> http://ia700802.us.archive.org/34/items/wikipedia-delete-2012-06/
> It is amazing how fast the stuff gets deleted on Wikipedia.
> What about the proposed deletions? Are there categories for those?
> thanks
> mike
>
> On Wed, May 30, 2012 at 6:26 AM, Mike Dupont
> <jamesmikedup...@googlemail.com> wrote:
> > https://github.com/h4ck3rm1k3/wikiteam code here
> >
> > On Wed, May 30, 2012 at 6:26 AM, Mike Dupont
> > <jamesmikedup...@googlemail.com> wrote:
> >> Ok, I merged the code from wikiteam and have a full history dump
> >> script that uploads to archive.org. The next step is to fix the bucket
> >> metadata in the script.
> >> mike
> >>
> >> On Tue, May 29, 2012 at 3:08 AM, Mike Dupont
> >> <jamesmikedup...@googlemail.com> wrote:
> >>> Well, I have now updated the script to include the XML dump in raw
> >>> format. I will have to add more information to the archive.org item, at
> >>> least a basic readme.
> >>> The other thing is that pywikipediabot does not seem to support the
> >>> full history, so I will have to move over to the wikiteam version
> >>> and rework it. I just spent 2 hours on this, so I am pretty happy with
> >>> the first version.
> >>>
> >>> mike
> >>>
> >>> On Tue, May 29, 2012 at 1:52 AM, Hydriz Wikipedia <ad...@alphacorp.tk>
> >>> wrote:
> >>>> This is quite nice, though the item's metadata is too little :)
> >>>>
> >>>> On Tue, May 29, 2012 at 3:40 AM, Mike Dupont
> >>>> <jamesmikedup...@googlemail.com> wrote:
> >>>>
> >>>>> The first version of the script is ready. It gets the versions, puts
> >>>>> them in a zip, and puts that on archive.org:
> >>>>> https://github.com/h4ck3rm1k3/pywikipediabot/blob/master/export_deleted.py
> >>>>>
> >>>>> Here is an example output:
> >>>>> http://archive.org/details/wikipedia-delete-2012-05
> >>>>>
> >>>>> http://ia601203.us.archive.org/24/items/wikipedia-delete-2012-05/archive2012-05-28T21:34:02.302183.zip
> >>>>>
> >>>>> I will cron this, and it should give a start on saving deleted data.
> >>>>> Articles will be exported once a day, even if they were
> >>>>> exported yesterday, as long as they are in one of the categories.
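A minimal sketch of the approach suggested earlier in the thread (use a deletion category as the feed for Special:Export). This is not the code in export_deleted.py; the function names and the sample response are hypothetical. `list=categorymembers`, `cmtitle`, `cmlimit`, `pages`, `curonly` and `history` are documented MediaWiki parameters, but the exact combination below is an assumption. The sketch only builds the requests and does not touch the network.

```python
from urllib.parse import urlencode

API_URL = "https://en.wikipedia.org/w/api.php"
EXPORT_URL = "https://en.wikipedia.org/wiki/Special:Export"
CATEGORY = "Category:Candidates_for_speedy_deletion"

# Hypothetical, hand-written sample shaped like a categorymembers reply.
SAMPLE_RESPONSE = {
    "query": {
        "categorymembers": [
            {"pageid": 1, "ns": 0, "title": "Example article"},
            {"pageid": 2, "ns": 0, "title": "Another example"},
        ]
    }
}


def category_members_url(category, limit=50):
    """Build the API URL that lists the pages in a category."""
    params = {
        "action": "query",
        "list": "categorymembers",
        "cmtitle": category,
        "cmlimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def titles_from_api_response(data):
    """Pull the page titles out of a decoded categorymembers reply."""
    return [member["title"] for member in data["query"]["categorymembers"]]


def export_form(titles, current_only=True):
    """Build the form fields to POST to Special:Export.

    'pages' takes one title per line; 'curonly' asks for the current
    revision only, 'history' for the full history (a wiki may cap or
    refuse full-history exports).
    """
    form = {"pages": "\n".join(titles)}
    if current_only:
        form["curonly"] = "1"
    else:
        form["history"] = "1"
    return form


if __name__ == "__main__":
    print(category_members_url(CATEGORY))
    print(export_form(titles_from_api_response(SAMPLE_RESPONSE)))
```

A cron job would fetch the category listing, POST the resulting form to `EXPORT_URL`, and store the returned XML; the fetching and uploading steps are left out here on purpose.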
> >>>>>
> >>>>> mike
> >>>>>
> >>>>> On Mon, May 21, 2012 at 7:21 PM, Mike Dupont
> >>>>> <jamesmikedup...@googlemail.com> wrote:
> >>>>> > Thanks! And run that once per day; they don't get deleted that
> >>>>> > quickly.
> >>>>> > mike
> >>>>> >
> >>>>> > On Mon, May 21, 2012 at 9:11 PM, emijrp <emi...@gmail.com> wrote:
> >>>>> >> Create a script that makes a request to Special:Export using
> >>>>> >> this category as feed:
> >>>>> >> https://en.wikipedia.org/wiki/Category:Candidates_for_speedy_deletion
> >>>>> >>
> >>>>> >> More info:
> >>>>> >> https://www.mediawiki.org/wiki/Manual:Parameters_to_Special:Export
> >>>>> >>
> >>>>> >>
> >>>>> >> 2012/5/21 Mike Dupont <jamesmikedup...@googlemail.com>
> >>>>> >>>
> >>>>> >>> Well, I would be happy with items like this:
> >>>>> >>> http://en.wikipedia.org/wiki/Template:Db-a7
> >>>>> >>> Would it be possible to extract them easily?
> >>>>> >>> mike
> >>>>> >>>
> >>>>> >>> On Thu, May 17, 2012 at 2:23 PM, Ariel T. Glenn
> >>>>> >>> <ar...@wikimedia.org> wrote:
> >>>>> >>> > There are a few other reasons articles get deleted: copyright
> >>>>> >>> > issues, personal identifying data, etc. This makes
> >>>>> >>> > maintaining the sort of mirror you propose problematic, although a
> >>>>> >>> > similar mirror is here:
> >>>>> >>> > http://deletionpedia.dbatley.com/w/index.php?title=Main_Page
> >>>>> >>> >
> >>>>> >>> > The dumps contain only data publicly available at the time
> >>>>> >>> > of the run, without deleted data.
> >>>>> >>> >
> >>>>> >>> > The articles aren't permanently deleted, of course. The
> >>>>> >>> > revision texts live on in the database, so a query on
> >>>>> >>> > toolserver, for example, could be used to get at them, but
> >>>>> >>> > that would need to be for research purposes.
> >>>>> >>> >
> >>>>> >>> > Ariel
> >>>>> >>> >
> >>>>> >>> > On 17-05-2012, Thu, at 13:30 +0200, Mike Dupont wrote:
> >>>>> >>> >> Hi,
> >>>>> >>> >> I am thinking about how to collect articles deleted based
> >>>>> >>> >> on the "not notable" criteria.
> >>>>> >>> >> Is there any way we can extract them from the MySQL
> >>>>> >>> >> binlogs? How are these mirrors working? I would be
> >>>>> >>> >> interested in setting up a mirror of
> >>>>> >>> >> deleted data, at least that which is not spam/vandalism
> >>>>> >>> >> based on tags.
> >>>>> >>> >> mike
> >>>>> >>> >>
> >>>>> >>> >> On Thu, May 17, 2012 at 1:09 PM, Ariel T. Glenn
> >>>>> >>> >> <ar...@wikimedia.org> wrote:
> >>>>> >>> >> > We now have three mirror sites, yay! The full list is
> >>>>> >>> >> > linked to from http://dumps.wikimedia.org/ and is also
> >>>>> >>> >> > available at
> >>>>> >>> >> >
> >>>>> >>> >> > http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Current_Mirrors
> >>>>> >>> >> >
> >>>>> >>> >> > Summarizing, we have:
> >>>>> >>> >> >
> >>>>> >>> >> > C3L (Brazil) with the last 5 known good dumps,
> >>>>> >>> >> > Masaryk University (Czech Republic) with the last 5 known
> >>>>> >>> >> > good dumps,
> >>>>> >>> >> > Your.org (USA) with the complete archive of dumps, and
> >>>>> >>> >> >
> >>>>> >>> >> > for the latest version of uploaded media, Your.org with
> >>>>> >>> >> > http/ftp/rsync access.
> >>>>> >>> >> >
> >>>>> >>> >> > Thanks to Carlos, Kevin and Yenya respectively at the
> >>>>> >>> >> > above sites for
> >>>>> >>> >> > volunteering space, time and effort to make this happen.
> >>>>> >>> >> >
> >>>>> >>> >> > As people noticed earlier, a series of media tarballs
> >>>>> >>> >> > per-project (excluding commons) is being generated.
> >>>>> >>> >> > As soon as the first run of
> >>>>> >>> >> > these is complete we'll announce its location and start
> >>>>> >>> >> > generating them on a semi-regular basis.
> >>>>> >>> >> >
> >>>>> >>> >> > As we've been getting the bugs out of the mirroring
> >>>>> >>> >> > setup, it is getting easier to add new locations. Know
> >>>>> >>> >> > anyone interested? Please let us
> >>>>> >>> >> > know; we would love to have them.
> >>>>> >>> >> >
> >>>>> >>> >> > Ariel
> >>>>> >>> >> >
> >>>>> >>> >> >
> >>>>> >>> >> > _______________________________________________
> >>>>> >>> >> > Wikitech-l mailing list
> >>>>> >>> >> > Wikitech-l@lists.wikimedia.org
> >>>>> >>> >> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >>>>> >>> >>
> >>>>> >>> >
> >>>>> >>>
> >>>>> >>> --
> >>>>> >>> James Michael DuPont
> >>>>> >>> Member of Free Libre Open Source Software Kosova
> >>>>> >>> http://flossk.org
> >>>>> >>> Contributor FOSM, the CC-BY-SA map of the world
> >>>>> >>> http://fosm.org
> >>>>> >>> Mozilla Rep https://reps.mozilla.org/u/h4ck3rm1k3
> >>>>> >>>
> >>>>> >>
> >>>>> >> --
> >>>>> >> Emilio J. Rodríguez-Posada.
> >>>>> >> E-mail: emijrp AT gmail DOT com
> >>>>> >> Pre-doctoral student at the University of Cádiz (Spain)
> >>>>> >> Projects: AVBOT | StatMediaWiki | WikiEvidens | WikiPapers |
> >>>>> >> WikiTeam
> >>>>> >> Personal website: https://sites.google.com/site/emijrp/
> >>>>> >>
> >>>>> >>
> >>>>> >> _______________________________________________
> >>>>> >> Xmldatadumps-l mailing list
> >>>>> >> xmldatadump...@lists.wikimedia.org
> >>>>> >> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
> >>>>> >>
> >>>>
> >>>> --
> >>>> Regards,
> >>>> Hydriz
> >>>>
> >>>> We've created the greatest collection of shared knowledge in
> >>>> history. Help protect Wikipedia.
> >>>> Donate now: http://donate.wikimedia.org
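To make Derric's magnet-link suggestion concrete, here is a minimal sketch of building a magnet URI that carries a direct-download fallback in the `as` (acceptable source) parameter. The info hash, file name and download URL used in the example are made up for illustration; real values would come from an actual torrent of a dump file. The helper names are hypothetical as well. Client support for `as` varies, and some clients read the web seed parameter `ws` instead, so treat this as a sketch under those assumptions.

```python
from urllib.parse import parse_qs, quote

HEX_DIGITS = set("0123456789abcdef")


def make_magnet(info_hash_hex, display_name, web_sources):
    """Build a magnet URI with a BitTorrent info hash and web fallbacks.

    info_hash_hex: 40 hex characters (SHA-1 info hash of the torrent).
    display_name:  file name shown by the client (dn).
    web_sources:   direct-download URLs offered as acceptable sources (as).
    """
    digest = info_hash_hex.lower()
    if len(digest) != 40 or not set(digest) <= HEX_DIGITS:
        raise ValueError("info hash must be 40 hexadecimal characters")
    parts = ["xt=urn:btih:" + digest, "dn=" + quote(display_name, safe="")]
    parts += ["as=" + quote(url, safe="") for url in web_sources]
    return "magnet:?" + "&".join(parts)


def acceptable_sources(magnet_uri):
    """Read the 'as' URLs back out of a magnet URI."""
    query = magnet_uri.split("?", 1)[1]
    return parse_qs(query).get("as", [])


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    link = make_magnet(
        "0123456789abcdef0123456789abcdef01234567",
        "example-pages-articles.xml.bz2",
        ["https://example.org/dumps/example-pages-articles.xml.bz2"],
    )
    print(link)
    print(acceptable_sources(link))
```

With a link like this, a client that finds no peers can still fall back to the direct download, which is the "best of both worlds" behaviour described in the thread.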