Hi João,

Given that you're mostly after the "metadata" of revisions and users rather than article text, I suggest you take a look at the mediawiki-history <https://dumps.wikimedia.org/other/mediawiki_history/readme.html> dumps. Those dumps are split by year-month for very big projects, by year for big ones, and not split at all for smaller projects. For ptwiki <https://dumps.wikimedia.org/other/mediawiki_history/2022-01/ptwiki/> the data is split by year, so you could download only the years you're after.

Also, the data is pretty rich in terms of the information available per row, so there is a good chance it will help you :) The data schema is not simple, so don't hesitate to ask if you need help!

Best
Joseph
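To illustrate how the per-year files could feed a ranking of editors by edit count, here is a minimal Python sketch. It is not an official tool, and it rests on assumptions that should be checked against the readme linked above: that each file is a bz2-compressed, tab-separated file without a header row, that revision creations are marked by an entity value of "revision" and a type value of "create", and that the three column positions below are right. The file names in the usage note are hypothetical as well.

```python
import bz2
from collections import Counter

# ASSUMED 0-based column positions; verify them against the readme schema.
EVENT_ENTITY_COL = 1      # assumed position of the event entity field
EVENT_TYPE_COL = 2        # assumed position of the event type field
EVENT_USER_TEXT_COL = 7   # assumed position of the user name field


def count_edits(lines,
                entity_col=EVENT_ENTITY_COL,
                type_col=EVENT_TYPE_COL,
                user_col=EVENT_USER_TEXT_COL):
    """Count revision-creation rows per user from an iterable of TSV lines."""
    needed = max(entity_col, type_col, user_col)
    counts = Counter()
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) <= needed:
            continue  # skip rows too short to hold the columns we need
        # "revision" and "create" are assumed marker values (see lead-in).
        if fields[entity_col] == "revision" and fields[type_col] == "create":
            counts[fields[user_col]] += 1
    return counts


def count_edits_in_file(path, **column_overrides):
    """Stream one bz2-compressed TSV file without loading it into memory."""
    with bz2.open(path, mode="rt", encoding="utf-8") as handle:
        return count_edits(handle, **column_overrides)
```

Because `Counter` objects can be added together, the yearly results combine into one ranking, for example `total = sum((count_edits_in_file(p) for p in paths), Counter())` followed by `total.most_common(50)`, where `paths` is a list of the downloaded yearly files (hypothetical names such as `ptwiki-2001.tsv.bz2`). Streaming line by line keeps memory use low, which matters given the size concern raised in the original question.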
On Tue, Feb 8, 2022 at 1:26 PM <[email protected]> wrote:

> Hi there.
>
> I am writing to see if someone could advise on a strategy to build a list
> of editors on Wikipedia in Portuguese from 2001 to 2006. I am working on a
> piece to understand how the Wikimedia community came to be in the
> Portuguese-speaking world and therefore want to explore incentives for
> contributions before a stronger sense of community and shared identity
> existed. This research is led by Flávia Varella (Theory of History Wiki
> initiative, Federal University of Santa Catarina) and myself (Wiki
> Movimento Brasil, Cásper Líbero School of Journalism).
>
> Ideally, we would like to rank the list of editors based on their edit
> numbers. If possible, we would gather other variables to have a sense of
> contributions back then, i.e., bytes added on main space, talk pages etc.
> But this appears not to be so easy.
>
> One strategy that was suggested to me was to look at the ptwiki dump. But
> the dump is huge and would require major computational capacity to identify
> the period we are interested in. One way out would be to find a place with
> a dump from 2006: is this something that exists? Who might have access to
> older dumps? What other way(s) might exist to find the data I am looking
> for?
>
> I appreciate any help!
>
> Cheers,
>
> João (User:Joalpe)
> _______________________________________________
> Wiki-research-l mailing list -- [email protected]
> To unsubscribe send an email to [email protected]
>

--
Joseph Allemandou (joal) (he / him)
Staff Data Engineer
Wikimedia Foundation
