I'm not sure I want to throw the API open to the public (the grok.se folks, 
and others, offer a fine service for casual experimentation).

However, I am willing to share the data with interested researchers who 
need to do some serious crunching (I have a Java API and could 
distribute database credentials on a per-case basis).

I'll note that I only parse the English Wikipedia at this time. I've found 
it useful in my anti-vandalism research (e.g., "given that an edit survived 
between time [w] and [x] on article [y], we estimate it received [z] 
views"). Thanks, -AW
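For illustration, that view estimate amounts to summing hourly counts over 
the window an edit survived. A minimal Python sketch, using a hypothetical 
in-memory (title, hour) -> views mapping as a stand-in for the actual 
database schema (which is not described here):

```python
from datetime import datetime, timedelta

# Hypothetical stand-in for the hourly page-view table; the real data
# lives in a database indexed by page title.
hourly_views = {
    ("Example_article", datetime(2011, 4, 6, 0)): 120,
    ("Example_article", datetime(2011, 4, 6, 1)): 95,
    ("Example_article", datetime(2011, 4, 6, 2)): 110,
}

def estimated_views(title, start, end):
    """Sum hourly view counts for `title` over hours touching [start, end)."""
    total = 0
    hour = start.replace(minute=0, second=0, microsecond=0)
    while hour < end:
        total += hourly_views.get((title, hour), 0)
        hour += timedelta(hours=1)
    return total

# An edit that survived from 00:30 to 02:30 touches three hourly buckets.
print(estimated_views("Example_article",
                      datetime(2011, 4, 6, 0, 30),
                      datetime(2011, 4, 6, 2, 30)))  # prints 325
```

Note this counts whole hourly buckets that overlap the interval; a finer 
estimate could pro-rate the partial hours at either end.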


On 04/06/2011 08:44 PM, MZMcBride wrote:
> Andrew G. West wrote:
>> I've parsed every one of these files (at hour granularity; grok.se
>> aggregates at day-level, I believe) since Jan. 2010 into a DB structure
>> indexed by page title. It takes up about 400GB of space, at the moment.
>
> Is your database available to the public? The Toolserver folks have been
> talking about getting the page view stats into usable form for quite some
> time, but nothing's happened yet. If you have an API or something similar,
> that would be fantastic. (stats.grok.se has a rudimentary API that I don't
> imagine many people are aware of.)
>
> MZMcBride
>
>
>
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

-- 
Andrew G. West, Doctoral Student
Dept. of Computer and Information Science
University of Pennsylvania, Philadelphia PA
Website: http://www.cis.upenn.edu/~westand

