Not sure I want to throw the API open to the public (the grok.se folks, and others, have a fine service for casual experimentation).
However, I am willing to share the data with interested researchers who
need to do some serious crunching (I have a Java API and could
distribute database credentials on a per-case basis). I'll note that I
only parse English Wikipedia at this time. I've found it useful in my
anti-vandalism research (i.e., "given that an edit survived between
time [w] and [x] on article [y], we estimate it received [z] views");
a sketch of that estimation appears below.

Thanks, -AW

On 04/06/2011 08:44 PM, MZMcBride wrote:
> Andrew G. West wrote:
>> I've parsed every one of these files (at hour granularity; grok.se
>> aggregates at day-level, I believe) since Jan. 2010 into a DB
>> structure indexed by page title. It takes up about 400GB of space,
>> at the moment.
>
> Is your database available to the public? The Toolserver folks have
> been talking about getting the page view stats into usable form for
> quite some time, but nothing's happened yet. If you have an API or
> something similar, that would be fantastic. (stats.grok.se has a
> rudimentary API that I don't imagine many people are aware of.)
>
> MZMcBride

--
Andrew G. West, Doctoral Student
Dept. of Computer and Information Science
University of Pennsylvania, Philadelphia PA
Website: http://www.cis.upenn.edu/~westand
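[Editor's sketch] To make the view-estimation idea concrete, here is a
minimal Java/JDBC sketch that sums hourly counts over an edit's
lifespan. The table and column names (page_views, title, hour_start,
views), the JDBC URL, and the credentials are hypothetical placeholders
for illustration; the thread does not describe the actual schema, and
credentials are only distributed per-case, as noted above.

import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Timestamp;

/**
 * Estimates how many views an edit received while it was the live
 * revision of an article, by summing hourly page-view counts between
 * the edit's save time [w] and its replacement time [x].
 *
 * Assumed (hypothetical) table: page_views(title, hour_start, views),
 * with one row per article per hour, indexed by page title.
 */
public class EditViewEstimator {

    private final Connection conn;

    public EditViewEstimator(Connection conn) {
        this.conn = conn;
    }

    /** Sum hourly views for [title] over the half-open interval [from, to). */
    public long estimateViews(String title, Timestamp from, Timestamp to)
            throws SQLException {
        String sql = "SELECT COALESCE(SUM(views), 0) FROM page_views "
                   + "WHERE title = ? AND hour_start >= ? AND hour_start < ?";
        try (PreparedStatement ps = conn.prepareStatement(sql)) {
            ps.setString(1, title);
            ps.setTimestamp(2, from);
            ps.setTimestamp(3, to);
            try (ResultSet rs = ps.executeQuery()) {
                rs.next(); // aggregate query always returns one row
                return rs.getLong(1);
            }
        }
    }

    public static void main(String[] args) throws SQLException {
        // Hypothetical JDBC URL and credentials for illustration only.
        try (Connection conn = DriverManager.getConnection(
                "jdbc:mysql://example.org/pageviews", "user", "pass")) {
            EditViewEstimator est = new EditViewEstimator(conn);
            long z = est.estimateViews("Philadelphia",
                    Timestamp.valueOf("2011-04-01 00:00:00"),  // time [w]
                    Timestamp.valueOf("2011-04-03 12:00:00")); // time [x]
            System.out.println("Estimated views: " + z);
        }
    }
}

Because the counts are stored at hour granularity, the first and last
(partial) hours of an edit's lifespan are only approximated; a
refinement could pro-rate those boundary hours by the fraction of the
hour the edit was actually live.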
