Just chiming in briefly on Wikia:

> Part of the reason may be that they don't offer regular data dumps.
> But WikiTeam has remedied and recovered dumps for most of their top
> 14k wikis (as well as all images):
> https://archive.org/details/wikia_dump_20140125
> https://archive.org/search.php?query=wikia_dump
>
> It's possible to release updates if needed, just tell us with some
> advance because it takes weeks or months due to aggressive throttling
> and blocking policies.

==> Thanks for that great piece of work. Are data dumps also available for a random sample (e.g. N = 1000) of wikis?

Thanks for the info,
Michael

>>> <[email protected]> 5/29/2014 8:00 am >>>
Send Wiki-research-l mailing list submissions to
[email protected]

To subscribe or unsubscribe via the World Wide Web, visit
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
or, via email, send a message with subject or body 'help' to
[email protected]

You can reach the person managing the list at
[email protected]

When replying, please edit your Subject line so it is more specific
than "Re: Contents of Wiki-research-l digest..."

Today's Topics:

1. Any studies on vandalism levels at Wikia? (Piotr Konieczny)
2. Re: Any studies on vandalism levels at Wikia? (Federico Leva (Nemo))
3. Re: Any studies on vandalism levels at Wikia? (Piotr Konieczny)
4. Re: Any studies on vandalism levels at Wikia? (Federico Leva (Nemo))

----------------------------------------------------------------------

Message: 1
Date: Thu, 29 May 2014 12:56:25 +0900
From: Piotr Konieczny <[email protected]>
To: Research into Wikimedia content and communities <[email protected]>
Cc: [email protected]
Subject: [Wiki-research-l] Any studies on vandalism levels at Wikia?
Message-ID: <[email protected]>
Content-Type: text/plain; charset=ISO-8859-1; format=flowed

I wanted to cite a statistic on whether vandalism at Wikia is higher or lower than on Wikipedia, but couldn't find anything. Is anyone familiar with research that I may want to check out? I am drawing almost nothing for studies of Wikia, outside the recent paper by Aaron Shaw and Benjamin Mako Hill (CC-ed), which did not however focus on vandalism.

Wikia (the largest wiki farm?) appears to be drastically under-researched...

--
Piotr Konieczny, PhD
http://hanyang.academia.edu/PiotrKonieczny
http://scholar.google.com/citations?user=gdV8_AEAAAAJ
http://en.wikipedia.org/wiki/User:Piotrus

------------------------------

Message: 2
Date: Thu, 29 May 2014 08:40:16 +0200
From: "Federico Leva (Nemo)" <[email protected]>
To: Research into Wikimedia content and communities <[email protected]>
Cc: [email protected]
Subject: Re: [Wiki-research-l] Any studies on vandalism levels at Wikia?
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8; format=flowed

Piotr Konieczny, 29/05/2014 05:56:
> Wikia (the largest wiki farm?) appears to be drastically
> under-researched...

Part of the reason may be that they don't offer regular data dumps. But WikiTeam has remedied and recovered dumps for most of their top 14k wikis (as well as all images):
https://archive.org/details/wikia_dump_20140125
https://archive.org/search.php?query=wikia_dump

It's possible to release updates if needed, just tell us with some advance because it takes weeks or months due to aggressive throttling and blocking policies.

Nemo

------------------------------

Message: 3
Date: Thu, 29 May 2014 19:22:45 +0900
From: Piotr Konieczny <[email protected]>
To: Research into Wikimedia content and communities <[email protected]>
Subject: Re: [Wiki-research-l] Any studies on vandalism levels at Wikia?
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8; format=flowed

That's intriguing, any idea why Wikia is being so unfriendly with that? Are they doing the usual corporation "our data is ours/secrecy is good/we don't need your research as it may reveal things we don't want the world/competitors to know about" shtick?

--
Piotr Konieczny, PhD
http://hanyang.academia.edu/PiotrKonieczny
http://scholar.google.com/citations?user=gdV8_AEAAAAJ
http://en.wikipedia.org/wiki/User:Piotrus

On 5/29/2014 15:40, Federico Leva (Nemo) wrote:
> Piotr Konieczny, 29/05/2014 05:56:
>> Wikia (the largest wiki farm?) appears to be drastically
>> under-researched...
>
> Part of the reason may be that they don't offer regular data dumps.
> But WikiTeam has remedied and recovered dumps for most of their top
> 14k wikis (as well as all images):
> https://archive.org/details/wikia_dump_20140125
> https://archive.org/search.php?query=wikia_dump
>
> It's possible to release updates if needed, just tell us with some
> advance because it takes weeks or months due to aggressive throttling
> and blocking policies.
>
> Nemo
>
> _______________________________________________
> Wiki-research-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

------------------------------

Message: 4
Date: Thu, 29 May 2014 12:55:48 +0200
From: "Federico Leva (Nemo)" <[email protected]>
To: Research into Wikimedia content and communities <[email protected]>
Subject: Re: [Wiki-research-l] Any studies on vandalism levels at Wikia?
Message-ID: <[email protected]>
Content-Type: text/plain; charset=UTF-8; format=flowed

Piotr Konieczny, 29/05/2014 12:22:
> That's intriguing, any idea why Wikia is being so unfriendly with that?
> Are they doing the usual corporation "our data is ours/secrecy is
> good/we don't need your research as it may reveal things we don't want
> the world/competitors to know about" shtick?

Nothing like that: they consistently reply that dumps are wonderful and in their opinion they do all they should.
http://archiveteam.org/index.php?title=Wikia#Download

When you explain to them that it's not enough, they don't disagree, but passive-aggressively refer to someone else in the chain of command (I think I covered it all by now).
Their current excuse is that they're not sure they have enough disk space on http://s3.amazonaws.com/wikia_xml_dumps/*

Nemo

------------------------------

_______________________________________________
Wiki-research-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wiki-research-l

End of Wiki-research-l Digest, Vol 105, Issue 26
************************************************

Zentrum für Europäische Wirtschaftsforschung GmbH (ZEW)
Centre for European Economic Research
L7,1 68161 Mannheim Germany
Seat of the Company: Mannheim - Local Court Mannheim HRB 6554
Chairwoman of the Supervisory Board: Minister Theresia Bauer MdL - Executive Directors: Prof. Dr. Clemens Fuest, Thomas Kohl
--------------------------------------------------------------------------------
