> Why do you need to access the live wikipedia for this?
> Using categorylinks.sql and page.sql you should be able to fetch the
> same data. Probably faster.

In my research, the answer to this question is two-fold:

A) Creating a local copy of Wikipedia (using MediaWiki and various
import tools) is quite a process, and requires a significant
investment of time and research in itself.

B) A few months ago, I pulled 333 semi-random articles from the live
API -- of those, 329 (about 99%) had changed significantly since the
20100312 dump (which was the newest dump at the time). A new check
against the 20110115 dump shows a similar percentage.

Caveat -- my research is largely centered on infobox template usage,
which is relatively new, so infoboxes tend to be updated frequently.
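
For anyone who wants to repeat the check in B, here is a rough sketch
in Python. It is not the script I ran: the endpoint, the function
names, and the sample titles are assumptions for illustration, and it
only tests whether the latest revision is newer than the dump date,
which is a weaker test than "significant changes".

```python
from datetime import datetime, timezone
from urllib.parse import urlencode

# Assumed endpoint (English Wikipedia); adjust for other wikis.
API_URL = "https://en.wikipedia.org/w/api.php"


def revision_query_url(titles):
    """Build a URL asking the live API for each title's latest revision timestamp."""
    params = {
        "action": "query",
        "prop": "revisions",
        "rvprop": "timestamp",
        "titles": "|".join(titles),
        "format": "json",
    }
    return API_URL + "?" + urlencode(params)


def changed_since(dump_date, last_edited):
    """Return the titles whose latest revision is newer than the dump.

    dump_date   -- dump name as YYYYMMDD, e.g. "20100312"
    last_edited -- hypothetical mapping of title -> ISO timestamp, as
                   collected from the API responses
    """
    cutoff = datetime.strptime(dump_date, "%Y%m%d").replace(tzinfo=timezone.utc)
    changed = []
    for title, stamp in last_edited.items():
        edited = datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ")
        if edited.replace(tzinfo=timezone.utc) > cutoff:
            changed.append(title)
    return sorted(changed)
```

Fetching the URL and collecting the timestamps from the JSON response
is left out; the ratio len(changed) / len(last_edited) then gives the
percentage of sampled articles edited since the dump.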

-- James

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
