--- On Thu, 11/11/10, Diederik van Liere <[email protected]> wrote:

> From: Diederik van Liere <[email protected]>
> Subject: Re: [Wiki-research-l] Editor Trends Study - Improving the tool
> To: [email protected]
> Date: Thursday, November 11, 2010 23:44
> Dear Felipe,
> 
> We did investigate other tools before deciding to embark on this new
> project; as you rightly point out, we should minimize code overlap.
> Pywikipediabot is an editing tool as far as I know, and your tool,
> WikixRay, has definitely proven itself. However, I believe that a
> NoSQL solution will give better performance than SQL databases, and
> that has been one of the main reasons to write this tool.
> 
> I am not sure if a separate mailing list is required; at the moment
> it's not, but thanks for the suggestion, and I have added the SVN link.
> 

Thanks, Diederik. I'm also curious about testing the performance of MongoDB. I
admit I haven't tried this kind of database yet.

Will check the SVN.

Best,
F.

> Best,
> 
> Diederik
> >
> > --- On Wed, 10/11/10, Diederik van Liere <[email protected]> wrote:
> >
> > From: Diederik van Liere <[email protected]>
> > Subject: [Wiki-research-l] Editor Trends Study - Improving the tool
> > To: [email protected]
> > Date: Wednesday, November 10, 2010 00:02
> >
> > Hi, Diederik,
> >
> > I'm also glad to see progress in this project. Some comments inline.
> >
> > Dear researchers,
> >
> > Recently, we started the Editor Trends Study
> > (http://strategy.wikimedia.org/wiki/Editor_Trends_Study).
> > The goal of this study is to get a better understanding of the
> > community dynamics within the different Wikipedia projects.
> >
> > Part of this project consists of developing a tool
> > (http://strategy.wikimedia.org/wiki/Editor_Trends_Study/Software)
> > that parses a Wikipedia dump file, extracts the required information,
> > stores it in a database and exports it to a CSV file. This CSV file
> > can then be used in a statistical program such as R, Stata or SAS.
> >
> > Well, I would have expected the team to search first for open source
> > code already available that implements at least some (if not all)
> > of the planned functionality.
> >
> > Some examples are my own tool, WikiXRay, and Pywikipediabot (which,
> > AFAIK, now also includes a fast parser for Wikipedia dump files).
> >
> > For my tool, I now use git for version control, and you can use either
> > of the two repos available (the official one at Libresoft, or the
> > mirror at Gitorious):
> >
> > http://git.libresoft.es/WikixRay/
> > http://gitorious.org/wikixray/wikixray
> >
> > Well, they might not be the best possible software available, but I
> > guess they can help to solve some problems, or at least help you to
> > speed up the development and to avoid starting from scratch.
> >
> >
> > We are looking for some volunteers who would enjoy testing the tool.
> > You don't need to be a software developer (although it helps :)) to
> > help us; some patience, a bit of time and a fairly recent computer are
> > all you need. You should be comfortable installing programs and
> > working with a command-line interface, and have basic Subversion
> > experience. Python experience is a real bonus!
> >
> > The testing will focus on getting the tool to run without any
> > supervision. For more background information, have a look at:
> >
> > http://strategy.wikimedia.org/wiki/Editor_Trends_Study/Software
> >
> > Perhaps you're going to provide this info later, but I don't see the
> > links to your SVN repo (only [] ).
> >
> > We are testing the tool with the largest Wikipedia projects, so if you
> > would like to replicate the analysis on your own favorite Wikipedia
> > project or help improve the quality of the tool then please contact me
> > off-list.
> >
> > I think it would be more effective to have another public list to
> > which people specifically interested in this tool can subscribe (for
> > example, like the one we have exclusively for XML dumps).
> >
> > This should noticeably reduce the number of duplicate bug reports and
> > comments, since other people can learn about known issues.
> >
> > Hope this helps.
> >
> > Best,
> > Felipe.
> >
> > Best,
> >
> > Diederik
> 
> _______________________________________________
> Wiki-research-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wiki-research-l
> 


      

