On Tue, Jul 9, 2013 at 10:46 AM, Daniel Mietchen
<[email protected]> wrote:
> Hello together,
>
> in the framework of a GLAM project, we are looking for ways to
> (1) identify the number of pages in a given category - including via
> subcategories - on a given wiki

You can get the list of subcategories of a category with
list=categorymembers&cmtype=subcat. This returns only the direct
children, so you'd have to make a call for each individual
(sub)category you're interested in. Keep track of the categories
you've already visited, since the category graph can contain cycles.
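
To make the cycle handling concrete, here is a rough sketch of such a
walk in Python. It is only an illustration: the API_URL, the User-Agent
string, and the function names fetch_subcats and walk_categories are my
own placeholders, and the continuation handling assumes the wiki returns
a "continue" element in its responses.

```python
import json
import urllib.parse
import urllib.request
from collections import deque

# Assumption: replace with the api.php endpoint of the wiki you care about.
API_URL = "https://commons.wikimedia.org/w/api.php"


def fetch_subcats(category, api_url=API_URL):
    """Return the titles of the direct subcategories of `category`."""
    params = {
        "action": "query",
        "format": "json",
        "list": "categorymembers",
        "cmtitle": category,
        "cmtype": "subcat",
        "cmlimit": "max",
    }
    titles = []
    while True:
        url = api_url + "?" + urllib.parse.urlencode(params)
        # Placeholder User-Agent; use one that identifies your tool.
        req = urllib.request.Request(
            url, headers={"User-Agent": "category-walk-sketch/0.1"}
        )
        with urllib.request.urlopen(req) as resp:
            data = json.load(resp)
        titles += [m["title"] for m in data["query"]["categorymembers"]]
        if "continue" not in data:
            return titles
        # Feed the continuation values back in to get the next chunk.
        params.update(data["continue"])


def walk_categories(root, get_subcats=fetch_subcats):
    """Breadth-first walk of the category tree below `root`.

    The `seen` set ensures each category is expanded only once, so a
    loop such as A -> B -> A cannot cause endless requests.
    """
    seen = {root}
    queue = deque([root])
    while queue:
        current = queue.popleft()
        for sub in get_subcats(current):
            if sub not in seen:
                seen.add(sub)
                queue.append(sub)
    return seen
```

Calling walk_categories("Category:Some root") would return the root
plus every category reachable from it. The get_subcats argument is
there so the walk can be tried against a canned dictionary before
pointing it at a live wiki.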

You can get the number of pages in a category with prop=categoryinfo.
You can batch this by specifying up to 50 titles per query (500 if
your account has the "apihighlimits" user right).
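
As a rough illustration of the batching, the following Python sketch
splits a list of category titles into groups of 50 and reads the
"pages" count from each categoryinfo result. API_URL, the User-Agent
string, and the names api_get and category_page_counts are placeholders
of mine, not anything the API prescribes. Note that adding up the
counts over a whole tree would count a page once for every category it
sits in.

```python
import json
import urllib.parse
import urllib.request

# Assumption: replace with the api.php endpoint of the wiki you care about.
API_URL = "https://commons.wikimedia.org/w/api.php"


def api_get(params, api_url=API_URL):
    """Issue one GET request against api.php and decode the JSON reply."""
    url = api_url + "?" + urllib.parse.urlencode(params)
    # Placeholder User-Agent; use one that identifies your tool.
    req = urllib.request.Request(
        url, headers={"User-Agent": "categoryinfo-sketch/0.1"}
    )
    with urllib.request.urlopen(req) as resp:
        return json.load(resp)


def category_page_counts(titles, batch_size=50, get=api_get):
    """Map each existing category title to its page count.

    `batch_size` is 50 for ordinary accounts; 500 should only be used
    with the "apihighlimits" right.
    """
    counts = {}
    for start in range(0, len(titles), batch_size):
        batch = titles[start:start + batch_size]
        data = get({
            "action": "query",
            "format": "json",
            "prop": "categoryinfo",
            "titles": "|".join(batch),
        })
        for page in data["query"]["pages"].values():
            info = page.get("categoryinfo")
            # Titles without category data carry no categoryinfo element.
            if info is not None:
                counts[page["title"]] = info["pages"]
    return counts
```

The titles come back in the wiki's normalized form, so the keys of the
returned dictionary may differ slightly from the strings passed in.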

If you're going to be doing a lot of this, it might be better to
perform queries directly against the database, either by downloading
the database dumps or using Tool Labs.

> (2) get the pageview stats for all these pages, including on aggregate

The raw pageview stat data may also be available on Tool Labs. I see
some data in /shared/viewstats/, but it doesn't seem to be up to date.


-- 
Brad Jorsch (Anomie)
Software Engineer
Wikimedia Foundation

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
