priyank bagrecha wrote:
> I am trying to pull up a list of pageids from the snapshot, which belong to
> a specific category. Basically I am trying to pull up pages which are on
> book portal pages. I looked at the snapshot mysql db to see which tables i
> can use. but the fields of tables category and category_links didnt make any
> sense to me in regard to what they stand for. so i was wondering if somebody
> could help me with the sql.

Example category: "Mathematics"
Example namespace: "Portal" (100)

SELECT
  page_namespace,
  page_title
FROM page
JOIN categorylinks
ON cl_from = page_id
WHERE cl_to = 'Mathematics'
AND page_namespace = 100
AND page_is_redirect = 0 /* optional */;

* http://www.mediawiki.org/wiki/Manual:Page_table
* http://www.mediawiki.org/wiki/Manual:Categorylinks_table
* http://www.mediawiki.org/wiki/Manual:Category_table

You don't really need the category table in this case because you just want
relationships (links). If you wanted meta-data about particular categories,
you could (left) join against the category table on cat_name = cl_to.

In order to figure out which namespace ID (integer) corresponds to your
target namespace, you can check the wiki configuration (LocalSettings.php
and DefaultSettings.php) or the wiki's API.

MZMcBride



_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l

Reply via email to