Wenqin Ye wrote:
> If we are creating an AI app that needs to get information, would we be
> allowed to crawl Wikipedia for it? The app would probably be a search
> query of some kind that gives information back to the user, and one of
> the sites used is Wikipedia. The app would use parts of Wikipedia's
> articles, send that info back to the user, and give them a link to click
> if they want to visit the full article. Each user can only query/search
> once per second; however, the collective user base might query Wikipedia
> more often than that. Therefore, this web crawler may collectively crawl
> more than once per second across all users. Would this be allowed?
Hi.

Depending on your needs, MediaWiki has a robust Web API:
https://www.mediawiki.org/wiki/API:Main_page

The English Wikipedia Web API:
https://en.wikipedia.org/w/api.php

API etiquette is described here:
https://www.mediawiki.org/wiki/API:Etiquette

For larger data sets, you can try the XML or SQL dumps:
http://dumps.wikimedia.org/

Or perhaps a caching layer would make sense.

Hope that helps.

MZMcBride

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
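For concreteness, here is a minimal sketch of the Web API approach using only Python's standard library. It requests the plain-text intro extract of one article and memoizes results as a simple caching layer; the app name and contact address in the User-Agent header are placeholders, not real identifiers.

```python
import json
import urllib.parse
import urllib.request
from functools import lru_cache

API_URL = "https://en.wikipedia.org/w/api.php"

def build_query_url(title: str) -> str:
    """Build an API URL requesting the plain-text intro of one article."""
    params = urllib.parse.urlencode({
        "action": "query",
        "prop": "extracts",
        "exintro": 1,        # intro section only
        "explaintext": 1,    # plain text instead of HTML
        "titles": title,
        "format": "json",
        "formatversion": 2,
    })
    return f"{API_URL}?{params}"

@lru_cache(maxsize=1024)
def fetch_extract(title: str) -> str:
    """Fetch (and memoize) the intro extract of a Wikipedia article.

    lru_cache acts as a basic caching layer: repeated queries for the
    same title are served from memory instead of hitting Wikipedia again.
    """
    # Per API:Etiquette, identify your client with a descriptive
    # User-Agent (the name and contact below are placeholders).
    req = urllib.request.Request(
        build_query_url(title),
        headers={"User-Agent": "ExampleSearchApp/0.1 ([email protected])"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    return data["query"]["pages"][0]["extract"]
```

A real app would add error handling and rate limiting on top of this, but the point is that serial, cached API requests with a proper User-Agent are far friendlier than crawling article pages.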
