https://doc.wikimedia.org/Parsoid/master/#!/guide/jsapi also gives a nice interface for walking a document structure, including recursing into template arguments, etc. That interface could be made much faster by fetching content from RESTBase.
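
For the RESTBase route, here is a rough Python sketch (not the jsapi itself, and not tested against live output). It builds the revision URL in the form Eric gave, then walks the returned Parsoid HTML and collects wiki links, flagging those that sit inside content carrying an about="#mwt..." attribute. The names rest_html_url, fetch_html, extract_links and LinkCollector are made up for illustration. Treating every about="#mwt..." group as template output is a simplifying assumption, since other generated content can carry the same marker, and links that appear only in template *arguments* live in the data-mw attribute rather than in the DOM, so this sketch does not see them.

```python
from html.parser import HTMLParser
from urllib.parse import quote
from urllib.request import Request, urlopen

REST_BASE = "https://en.wikipedia.org/api/rest_v1/page/html"

# HTML void elements never get an end tag, so they must not be pushed
# onto the open-element stack below.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img", "input",
             "link", "meta", "source", "track", "wbr"}


def rest_html_url(title, revision):
    """Build the REST API URL for one revision of a page."""
    return f"{REST_BASE}/{quote(title, safe='')}/{revision}"


def fetch_html(title, revision):
    """Fetch the Parsoid HTML for a revision (needs network access)."""
    # The User-Agent string is a placeholder; use one identifying your tool.
    req = Request(rest_html_url(title, revision),
                  headers={"User-Agent": "link-sketch/0.1 (example)"})
    with urlopen(req) as resp:
        return resp.read().decode("utf-8")


class LinkCollector(HTMLParser):
    """Collect (href, from_template) pairs for rel="mw:WikiLink" anchors."""

    def __init__(self):
        super().__init__()
        self._in_template = []  # one flag per currently open element
        self.links = []

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        about = attrs.get("about") or ""
        parent_inside = bool(self._in_template) and self._in_template[-1]
        # Assumption: an about="#mwt..." id marks template-generated content.
        inside = parent_inside or about.startswith("#mwt")
        if tag == "a" and "mw:WikiLink" in (attrs.get("rel") or "").split():
            self.links.append((attrs.get("href"), inside))
        if tag not in VOID_TAGS:
            self._in_template.append(inside)

    def handle_endtag(self, tag):
        if tag not in VOID_TAGS and self._in_template:
            self._in_template.pop()


def extract_links(html):
    """Return every wiki link in the HTML with its from-template flag."""
    collector = LinkCollector()
    collector.feed(html)
    collector.close()
    return collector.links
```

Something like extract_links(fetch_html("Main Page", 664887982)) would then give the full list, and filtering on the second element of each pair separates the links written directly in the page from the ones a template produced.
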
Note that links generated by templates are a sort of special case. Do you want only the links which appear in the *arguments* to the template? Or do you want the links that are contained in the template itself? These cases are slightly different.
  --scott

On Wed, Sep 30, 2015 at 9:44 AM, Eric Evans wrote:

> On Wed, Sep 30, 2015 at 3:35 AM, Dimitrov, Dimitar wrote:
>
>> 1. What is the fastest way to get the html of an article for specific
>> revision or what is the best tool to setup local copy of Wikipedia
>> (currently I am experimenting with Xowa and Wikitaxi).
>
> You can use the REST API to fetch article html by revision (see:
> https://en.wikipedia.org/api/rest_v1/?doc).
>
> For example:
> https://en.wikipedia.org/api/rest_v1/page/html/Main%20Page/664887982
>
> The output this produces is generated by parsoid (see:
> https://www.mediawiki.org/wiki/Parsoid/MediaWiki_DOM_spec).
>
> --
> Eric Evans
> _______________________________________________
> Wikitech-l mailing list
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l

--
(http://cscott.net)
