Wenqin Ye wrote:
> If we are creating an AI app that needs to get information, would we be
> allowed to crawl Wikipedia for it? The app would probably be a search
> query of some kind that gives information back to the user, and one of
> the sites used is Wikipedia. The app would use parts of Wikipedia's
> articles, send that info back to the user, and give them a link to click
> if they want to visit the full article. Each user can only query/search
> once per second; however, the collective user base might query Wikipedia
> more often than that. Therefore, this web crawler may collectively crawl
> more than once per second across all users. Would this be allowed?
Hi.

Depending on your needs, MediaWiki has a robust Web API:
https://www.mediawiki.org/wiki/API:Main_page

The English Wikipedia Web API:
https://en.wikipedia.org/w/api.php

API etiquette is described here:
https://www.mediawiki.org/wiki/API:Etiquette

For larger data sets, you can try the XML or SQL dumps:
http://dumps.wikimedia.org/

Or perhaps a caching layer would make sense.

Hope that helps.

MZMcBride

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
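For concreteness, here is a minimal sketch of the Web API approach using only Python's standard library. It requests the plain-text intro extract of one article and memoizes results as a simple caching layer; the app name and contact address in the User-Agent header are placeholders, not real identifiers.

```python
import json
import urllib.parse
import urllib.request
from functools import lru_cache

API_URL = "https://en.wikipedia.org/w/api.php"

def build_query_url(title: str) -> str:
    """Build an API URL requesting the plain-text intro of one article."""
    params = urllib.parse.urlencode({
        "action": "query",
        "prop": "extracts",
        "exintro": 1,        # intro section only
        "explaintext": 1,    # plain text instead of HTML
        "titles": title,
        "format": "json",
        "formatversion": 2,
    })
    return f"{API_URL}?{params}"

@lru_cache(maxsize=1024)
def fetch_extract(title: str) -> str:
    """Fetch (and memoize) the intro extract of a Wikipedia article.

    lru_cache acts as a basic caching layer: repeated queries for the
    same title are served from memory instead of hitting Wikipedia again.
    """
    # Per API:Etiquette, identify your client with a descriptive
    # User-Agent (the name and contact below are placeholders).
    req = urllib.request.Request(
        build_query_url(title),
        headers={"User-Agent": "ExampleSearchApp/0.1 ([email protected])"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["query"]["pages"][0]["extract"]
```

A real app would add error handling and rate limiting on top of this, but the point is that serial, cached API requests with a proper User-Agent are far friendlier than crawling article pages.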
