If you just need the category structure, three of the SQL dump files should cover it. The page table (*-page.sql.gz) has page ids, titles, and a flag saying whether each page is a redirect. The category table (*-category.sql.gz) has category names, ids, and summary information (counts of pages, subcategories, and files). The categorylinks table (*-categorylinks.sql.gz) lists every category link, giving the id of the page and the name of the category it is in. You can find details on the tables here: http://www.mediawiki.org/wiki/Manual:Categorylinks_table (here's the category: http://www.mediawiki.org/wiki/Category:MediaWiki_database_tables )
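As a rough sketch of how page and categorylinks fit together, the snippet below joins them to list the members of each category while skipping redirects. It uses an in-memory SQLite database only so that it is self-contained; the real dumps are MySQL and would need to be imported first. The column names (page_id, page_namespace, page_title, page_is_redirect, cl_from, cl_to) follow the MediaWiki schema pages linked above, but the tables are cut down to the columns needed here, the category table is left out, and the sample rows and function names are made up for illustration.

```python
import sqlite3


def build_demo_db():
    """Create simplified page/categorylinks tables with made-up rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE page (
            page_id          INTEGER PRIMARY KEY,
            page_namespace   INTEGER,   -- 14 is the Category namespace
            page_title       TEXT,
            page_is_redirect INTEGER    -- 1 if the page is a redirect
        );
        CREATE TABLE categorylinks (
            cl_from INTEGER,            -- page_id of the member page
            cl_to   TEXT                -- category title, no namespace prefix
        );
    """)
    conn.executemany("INSERT INTO page VALUES (?, ?, ?, ?)", [
        (1, 0, "Apple", 0),
        (2, 0, "Apples", 1),    # hypothetical redirect, should be skipped
        (3, 14, "Fruit", 0),    # a category page that is itself categorized
        (4, 14, "Food", 0),
    ])
    conn.executemany("INSERT INTO categorylinks VALUES (?, ?)", [
        (1, "Fruit"),
        (2, "Fruit"),
        (3, "Food"),            # Category:Fruit is a subcategory of Food
    ])
    return conn


def category_members(conn):
    """Map each category title to its non-redirect members."""
    rows = conn.execute("""
        SELECT cl.cl_to, p.page_namespace, p.page_title
        FROM categorylinks AS cl
        JOIN page AS p ON p.page_id = cl.cl_from
        WHERE p.page_is_redirect = 0
        ORDER BY cl.cl_to, p.page_namespace, p.page_title
    """)
    members = {}
    for category, namespace, title in rows:
        members.setdefault(category, []).append((namespace, title))
    return members


if __name__ == "__main__":
    for category, pages in category_members(build_demo_db()).items():
        print(category, pages)
```

Members whose namespace is 14 are subcategories, so the same query gives both the ordinary members and the parent/child links between categories.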
Hopefully this will get you started.

Ariel

On Wed, 09-01-2013 at 10:51 -0800, Robert Crowe wrote:
> I'd like to mirror just the category structure of the English
> Wikipedia, and I'm wondering which of the dump files I need to start
> with.
>
> I don't need the page content, just the page names, and only for the
> most current revision. I need the categories and category members,
> and I'd like to exclude hidden categories. I also need to distinguish
> redirects, because I don't want to treat them as separate pages. As
> much as possible I'd like to work with SQL files, but I can crunch
> through XML if necessary.
>
> So which files do I need to download? I may also need some help in
> understanding the schemas.
>
> Thanks,
>
> Robert
>
> _______________________________________________
> Xmldatadumps-l mailing list
> Xmldatadumps-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/xmldatadumps-l