This is a place where volunteers can step in and make it happen without
the need for Wikimedia's infrastructure.  (This means I can concentrate
on my already very full plate of things too.)

http://meta.wikimedia.org/wiki/Data_dump_torrents

Have at!

Ariel

On Tue, 05-06-2012, at 08:57 -0400, Derric Atzrott wrote:
> I second this idea.  Large archives should always be available via 
> BitTorrent.  I would actually suggest posting magnet links for them, though, 
> instead of .torrent files.  This way you can leverage the acceptable-source 
> feature of magnet links.
> 
> https://en.wikipedia.org/wiki/Magnet_URI_scheme#Web_links_to_the_file
> 
> This way we get the best of both worlds: the constant availability of direct 
> downloads, and the reduction in load that P2P file sharing provides.
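> 
> As an illustration, a minimal sketch of assembling such a magnet link (the
> infohash here is a placeholder; the real one comes from the published
> .torrent, and the web seed is the normal dumps.wikimedia.org download URL):
> 
>     from urllib.parse import quote
> 
>     infohash = "0123456789abcdef0123456789abcdef01234567"  # placeholder BTIH
>     filename = "enwiki-20120601-pages-articles.xml.bz2"    # assumed dump file
>     web_seed = "http://dumps.wikimedia.org/enwiki/20120601/" + filename
> 
>     # xt = exact topic (BitTorrent infohash), dn = display name,
>     # as = "acceptable source": an HTTP URL clients may also download from.
>     magnet = ("magnet:?xt=urn:btih:" + infohash
>               + "&dn=" + quote(filename)
>               + "&as=" + quote(web_seed, safe=""))
>     print(magnet)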
> 
> Thank you,
> Derric Atzrott
> 
> -----Original Message-----
> From: wikitech-l-boun...@lists.wikimedia.org 
> [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Oren Bochman
> Sent: 05 June 2012 08:44
> To: 'Wikimedia developers'
> Subject: Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update
> 
> Any chance that these archives can be served via BitTorrent, so that even 
> partial downloaders can act as servers, leveraging P2P to reduce overall 
> bandwidth load on the servers and improve download speeds?
> 
> 
> -----Original Message-----
> From: wikitech-l-boun...@lists.wikimedia.org 
> [mailto:wikitech-l-boun...@lists.wikimedia.org] On Behalf Of Mike Dupont
> Sent: Saturday, June 02, 2012 1:28 AM
> To: Wikimedia developers; wikiteam-disc...@googlegroups.com
> Subject: Re: [Wikitech-l] [Xmldatadumps-l] XML dumps/Media mirrors update
> 
> I now have the cron archiving running every 30 minutes: 
> http://ia700802.us.archive.org/34/items/wikipedia-delete-2012-06/
> It is amazing how fast the stuff gets deleted on Wikipedia.
> What about the proposed deletions? Are there categories for that?
> thanks
> mike
> 
> On Wed, May 30, 2012 at 6:26 AM, Mike  Dupont 
> <jamesmikedup...@googlemail.com> wrote:
> > https://github.com/h4ck3rm1k3/wikiteam code here
> >
> > On Wed, May 30, 2012 at 6:26 AM, Mike  Dupont 
> > <jamesmikedup...@googlemail.com> wrote:
> >> Ok, I merged the code from wikiteam and have a full-history dump 
> >> script that uploads to archive.org; the next step is to fix the bucket 
> >> metadata in the script.
> >> mike
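> >>
> >> For reference, a rough and untested sketch of setting item metadata through
> >> archive.org's S3-like API (the x-archive-meta-* headers); the identifier,
> >> credentials, filename and metadata values below are all placeholders:
> >>
> >>     import requests
> >>
> >>     item = "wikipedia-delete-2012-06"              # archive.org item identifier
> >>     headers = {
> >>         "authorization": "LOW ACCESSKEY:SECRET",   # IA S3 keys (placeholders)
> >>         "x-archive-auto-make-bucket": "1",         # create the item if missing
> >>         "x-archive-meta-title": "Deleted Wikipedia articles, June 2012",
> >>         "x-archive-meta-mediatype": "data",
> >>     }
> >>     # upload the zip and apply the metadata in one PUT request
> >>     with open("archive.zip", "rb") as f:
> >>         requests.put("https://s3.us.archive.org/" + item + "/archive.zip",
> >>                      data=f, headers=headers)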
> >>
> >> On Tue, May 29, 2012 at 3:08 AM, Mike  Dupont 
> >> <jamesmikedup...@googlemail.com> wrote:
> >>> Well, I have now updated the script to include the XML dump in raw 
> >>> format. I will have to add more information to the archive.org item, at 
> >>> least a basic readme.
> >>> The other thing is that pywikipediabot does not seem to support the full 
> >>> history, so I will have to move over to the wikiteam version and rework 
> >>> it. I just spent 2 hours on this, so I am pretty happy with the first 
> >>> version.
> >>>
> >>> mike
> >>>
> >>> On Tue, May 29, 2012 at 1:52 AM, Hydriz Wikipedia <ad...@alphacorp.tk> 
> >>> wrote:
> >>>> This is quite nice, though the item's metadata is rather sparse :)
> >>>>
> >>>> On Tue, May 29, 2012 at 3:40 AM, Mike Dupont 
> >>>> <jamesmikedup...@googlemail.com> wrote:
> >>>>
> >>>>> The first version of the script is ready: it gets the versions, puts 
> >>>>> them in a zip, and puts that on archive.org:
> >>>>> https://github.com/h4ck3rm1k3/pywikipediabot/blob/master/export_deleted.py
> >>>>>
> >>>>> Here is an example of the output:
> >>>>> http://archive.org/details/wikipedia-delete-2012-05
> >>>>>
> >>>>> http://ia601203.us.archive.org/24/items/wikipedia-delete-2012-05/archive2012-05-28T21:34:02.302183.zip
> >>>>>
> >>>>> I will cron this, and it should be a start at saving deleted data.
> >>>>> Articles will be exported once a day, even if they were exported 
> >>>>> yesterday, as long as they are in one of the categories.
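> >>>>>
> >>>>> In outline (a simplified, hypothetical sketch rather than the exact
> >>>>> script linked above), the daily job bundles the exported XML like this:
> >>>>>
> >>>>>     import datetime
> >>>>>     import zipfile
> >>>>>
> >>>>>     def bundle(exported_pages):
> >>>>>         """exported_pages: dict mapping page title -> exported XML string."""
> >>>>>         name = "archive" + datetime.datetime.now().isoformat() + ".zip"
> >>>>>         with zipfile.ZipFile(name, "w", zipfile.ZIP_DEFLATED) as zf:
> >>>>>             for title, xml in exported_pages.items():
> >>>>>                 # one XML file per page; slashes in titles would break paths
> >>>>>                 zf.writestr(title.replace("/", "_") + ".xml", xml)
> >>>>>         return name  # this zip is then uploaded to the archive.org item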
> >>>>>
> >>>>> mike
> >>>>>
> >>>>> On Mon, May 21, 2012 at 7:21 PM, Mike  Dupont 
> >>>>> <jamesmikedup...@googlemail.com> wrote:
> >>>>> > Thanks! I'll run that once per day; they don't get deleted that 
> >>>>> > quickly.
> >>>>> > mike
> >>>>> >
> >>>>> > On Mon, May 21, 2012 at 9:11 PM, emijrp <emi...@gmail.com> wrote:
> >>>>> >> Create a script that makes a request to Special:Export, using this 
> >>>>> >> category as the feed:
> >>>>> >> https://en.wikipedia.org/wiki/Category:Candidates_for_speedy_deletion
> >>>>> >>
> >>>>> >> More info:
> >>>>> >> https://www.mediawiki.org/wiki/Manual:Parameters_to_Special:Export
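> >>>>> >>
> >>>>> >> A rough sketch of that approach (untested; the parameter names are
> >>>>> >> described on the manual page above, so double-check them there):
> >>>>> >>
> >>>>> >>     import requests
> >>>>> >>
> >>>>> >>     API = "https://en.wikipedia.org/w/api.php"
> >>>>> >>     EXPORT = "https://en.wikipedia.org/w/index.php?title=Special:Export&action=submit"
> >>>>> >>
> >>>>> >>     # 1. List the members of the category via the API.
> >>>>> >>     r = requests.get(API, params={
> >>>>> >>         "action": "query", "list": "categorymembers",
> >>>>> >>         "cmtitle": "Category:Candidates_for_speedy_deletion",
> >>>>> >>         "cmlimit": "500", "format": "json"})
> >>>>> >>     titles = [m["title"] for m in r.json()["query"]["categorymembers"]]
> >>>>> >>
> >>>>> >>     # 2. Feed those titles to Special:Export, asking for full histories.
> >>>>> >>     xml = requests.post(EXPORT, data={
> >>>>> >>         "pages": "\n".join(titles),
> >>>>> >>         "history": "1"}).text
> >>>>> >>     with open("speedy-deletion-export.xml", "w", encoding="utf-8") as f:
> >>>>> >>         f.write(xml)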
> >>>>> >>
> >>>>> >>
> >>>>> >> 2012/5/21 Mike Dupont <jamesmikedup...@googlemail.com>
> >>>>> >>>
> >>>>> >>> Well, I would be happy for items like this:
> >>>>> >>> http://en.wikipedia.org/wiki/Template:Db-a7
> >>>>> >>> Would it be possible to extract them easily?
> >>>>> >>> mike
> >>>>> >>>
> >>>>> >>> On Thu, May 17, 2012 at 2:23 PM, Ariel T. Glenn 
> >>>>> >>> <ar...@wikimedia.org>
> >>>>> >>> wrote:
> >>>>> >>> > There are a few other reasons articles get deleted: copyright 
> >>>>> >>> > issues, personally identifying data, etc.  This makes 
> >>>>> >>> > maintaining the sort of mirror you propose problematic, although a 
> >>>>> >>> > similar mirror is here:
> >>>>> >>> > http://deletionpedia.dbatley.com/w/index.php?title=Main_Page
> >>>>> >>> >
> >>>>> >>> > The dumps contain only data publicly available at the time of 
> >>>>> >>> > the run, without deleted data.
> >>>>> >>> >
> >>>>> >>> > The articles aren't permanently deleted, of course.  The revision 
> >>>>> >>> > texts live on in the database, so a query on the Toolserver, for 
> >>>>> >>> > example, could be used to get at them, but that would need to be 
> >>>>> >>> > for research purposes.
> >>>>> >>> >
> >>>>> >>> > Ariel
> >>>>> >>> >
> >>>>> >>> > On Thu, 17-05-2012, at 13:30 +0200, Mike Dupont wrote:
> >>>>> >>> >> Hi,
> >>>>> >>> >> I am thinking about how to collect articles deleted under the 
> >>>>> >>> >> "not notable" criterion.  Is there any way we can extract them 
> >>>>> >>> >> from the MySQL binlogs?  How are these mirrors working?  I would 
> >>>>> >>> >> be interested in setting up a mirror of deleted data, at least 
> >>>>> >>> >> that which is not spam/vandalism, based on tags.
> >>>>> >>> >> mike
> >>>>> >>> >>
> >>>>> >>> >> On Thu, May 17, 2012 at 1:09 PM, Ariel T. Glenn 
> >>>>> >>> >> <ar...@wikimedia.org> wrote:
> >>>>> >>> >> > We now have three mirror sites, yay!  The full list is linked 
> >>>>> >>> >> > to from http://dumps.wikimedia.org/ and is also available at
> >>>>> >>> >> > http://meta.wikimedia.org/wiki/Mirroring_Wikimedia_project_XML_dumps#Current_Mirrors
> >>>>> >>> >> >
> >>>>> >>> >> > Summarizing, we have:
> >>>>> >>> >> >
> >>>>> >>> >> > C3L (Brazil) with the last 5 known good dumps,
> >>>>> >>> >> > Masaryk University (Czech Republic) with the last 5 known good dumps,
> >>>>> >>> >> > Your.org (USA) with the complete archive of dumps, and,
> >>>>> >>> >> > for the latest version of uploaded media, Your.org with 
> >>>>> >>> >> > http/ftp/rsync access.
> >>>>> >>> >> >
> >>>>> >>> >> > Thanks to Carlos, Kevin and Yenya, respectively, at the above 
> >>>>> >>> >> > sites for volunteering space, time and effort to make this 
> >>>>> >>> >> > happen.
> >>>>> >>> >> >
> >>>>> >>> >> > As people noticed earlier, a series of per-project media 
> >>>>> >>> >> > tarballs (excluding Commons) is being generated.  As soon as 
> >>>>> >>> >> > the first run of these is complete, we'll announce its location 
> >>>>> >>> >> > and start generating them on a semi-regular basis.
> >>>>> >>> >> >
> >>>>> >>> >> > As we've been getting the bugs out of the mirroring setup, it 
> >>>>> >>> >> > is getting easier to add new locations.  Know anyone 
> >>>>> >>> >> > interested?  Please let us know; we would love to have them.
> >>>>> >>> >> >
> >>>>> >>> >> > Ariel
> >>>>> >>> >> >
> >>>>> >>> >> >
> >>>>> >>> >> > _______________________________________________
> >>>>> >>> >> > Wikitech-l mailing list
> >>>>> >>> >> > Wikitech-l@lists.wikimedia.org 
> >>>>> >>> >> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >>>>> >>> >>
> >>>>> >>> >>
> >>>>> >>> >>
> >>>>> >>> >
> >>>>> >>> >
> >>>>> >>> >
> >>>>> >>> > _______________________________________________
> >>>>> >>> > Wikitech-l mailing list
> >>>>> >>> > Wikitech-l@lists.wikimedia.org 
> >>>>> >>> > https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >>>>> >>>
> >>>>> >>>
> >>>>> >>>
> >>>>> >>> --
> >>>>> >>> James Michael DuPont
> >>>>> >>> Member of Free Libre Open Source Software Kosova 
> >>>>> >>> http://flossk.org Contributor FOSM, the CC-BY-SA map of the 
> >>>>> >>> world http://fosm.org Mozilla Rep
> >>>>> >>> https://reps.mozilla.org/u/h4ck3rm1k3
> >>>>> >>>
> >>>>> >>> _______________________________________________
> >>>>> >>> Wikitech-l mailing list
> >>>>> >>> Wikitech-l@lists.wikimedia.org 
> >>>>> >>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >>>>> >>
> >>>>> >>
> >>>>> >>
> >>>>> >>
> >>>>> >> --
> >>>>> >> Emilio J. Rodríguez-Posada. E-mail: emijrp AT gmail DOT com 
> >>>>> >> Pre-doctoral student at the University of Cádiz (Spain)
> >>>>> >> Projects: AVBOT | StatMediaWiki | WikiEvidens | WikiPapers | 
> >>>>> >> WikiTeam Personal website:
> >>>>> >> https://sites.google.com/site/emijrp/
> >>>>> >>
> >>>>> >>
> >>>>> >> _______________________________________________
> >>>>> >> Xmldatadumps-l mailing list
> >>>>> >> xmldatadump...@lists.wikimedia.org
> >>>>> >> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l
> >>>>> >>
> >>>>> >
> >>>>> >
> >>>>> >
> >>>>> > --
> >>>>> > James Michael DuPont
> >>>>> > Member of Free Libre Open Source Software Kosova 
> >>>>> > http://flossk.org Contributor FOSM, the CC-BY-SA map of the 
> >>>>> > world http://fosm.org Mozilla Rep
> >>>>> > https://reps.mozilla.org/u/h4ck3rm1k3
> >>>>>
> >>>>>
> >>>>>
> >>>>> --
> >>>>> James Michael DuPont
> >>>>> Member of Free Libre Open Source Software Kosova http://flossk.org 
> >>>>> Contributor FOSM, the CC-BY-SA map of the world http://fosm.org 
> >>>>> Mozilla Rep https://reps.mozilla.org/u/h4ck3rm1k3
> >>>>>
> >>>>> _______________________________________________
> >>>>> Wikitech-l mailing list
> >>>>> Wikitech-l@lists.wikimedia.org
> >>>>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >>>>>
> >>>>
> >>>>
> >>>>
> >>>> --
> >>>> Regards,
> >>>> Hydriz
> >>>>
> >>>> We've created the greatest collection of shared knowledge in 
> >>>> history. Help protect Wikipedia. Donate now:
> >>>> http://donate.wikimedia.org
> >>>> _______________________________________________
> >>>> Wikitech-l mailing list
> >>>> Wikitech-l@lists.wikimedia.org
> >>>> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> >>>
> >>>
> >>>
> >>> --
> >>> James Michael DuPont
> >>> Member of Free Libre Open Source Software Kosova http://flossk.org 
> >>> Contributor FOSM, the CC-BY-SA map of the world http://fosm.org 
> >>> Mozilla Rep https://reps.mozilla.org/u/h4ck3rm1k3
> >>
> >>
> >>
> >> --
> >> James Michael DuPont
> >> Member of Free Libre Open Source Software Kosova http://flossk.org 
> >> Contributor FOSM, the CC-BY-SA map of the world http://fosm.org 
> >> Mozilla Rep https://reps.mozilla.org/u/h4ck3rm1k3
> >
> >
> >
> > --
> > James Michael DuPont
> > Member of Free Libre Open Source Software Kosova http://flossk.org 
> > Contributor FOSM, the CC-BY-SA map of the world http://fosm.org 
> > Mozilla Rep https://reps.mozilla.org/u/h4ck3rm1k3
> 
> 
> 
> --
> James Michael DuPont
> Member of Free Libre Open Source Software Kosova http://flossk.org 
> Contributor FOSM, the CC-BY-SA map of the world http://fosm.org Mozilla Rep 
> https://reps.mozilla.org/u/h4ck3rm1k3
> 
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
> 



_______________________________________________
Wikitech-l mailing list
Wikitech-l@lists.wikimedia.org
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
