Hi, your question is too vaguely formulated. Please clarify it.
On Thu, Jan 12, 2012 at 2:37 PM, Sebastian Hellmann <[email protected]> wrote:

> Hello all,
> is there a query language for wiki syntax?
> (NOTE: I really do not mean the Wikipedia API here.)
>
> I am looking for an easy way to scrape data from wiki pages. That way,
> we could apply a crowd-sourcing approach to knowledge extraction from wikis.
>
> There must be thousands of data-scraping approaches. But is there one
> among them that has developed a "wiki scraper language"? Maybe with some
> sort of fuzziness involved, if the pages are too messy.
>
> I have not yet worked with the XML transformation of the wiki markup:
>
>   action=expandtemplates
>   generatexml - Generate XML parse tree
>
> Is it any good for issuing XPath queries?

1. XPath requires XML; MediaWiki markup is not XML.
2. The only application which (correctly!?) expands templates is MediaWiki itself.
3. You neglected to explain what you are trying to scrape and what constitutes a messy page.

> Thank you very much,
> Sebastian
>
> --
> Dipl. Inf. Sebastian Hellmann
> Department of Computer Science, University of Leipzig
> Projects: http://nlp2rdf.org , http://dbpedia.org
> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> Research Group: http://aksw.org
>
> _______________________________________________
> Wikitext-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitext-l

--
Oren Bochman
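[For context: a minimal sketch of what XPath-style queries over the preprocessor parse tree could look like. The XML below is hand-written to imitate the element names MediaWiki's `generatexml` output uses (`root`, `template`, `title`, `part`, `name`, `value`); it is not fetched from the API, and the real tree for any given page will differ.]

```python
# Sketch: querying a MediaWiki-style preprocessor parse tree with the
# limited XPath subset supported by Python's stdlib ElementTree.
import xml.etree.ElementTree as ET

# Hand-written example of a generatexml-like parse tree (an assumption,
# not actual API output).
parse_tree = """<root>Some article text
<template><title>Infobox person</title>
<part><name>name</name><equals>=</equals><value>Ada Lovelace</value></part>
<part><name>birth_year</name><equals>=</equals><value>1815</value></part>
</template></root>"""

root = ET.fromstring(parse_tree)

# Find all template names and all parameter names in document order.
titles = [t.text for t in root.findall(".//template/title")]
names = [n.text for n in root.findall(".//template/part/name")]

print(titles)  # ['Infobox person']
print(names)   # ['name', 'birth_year']
```

This only works once the markup has been turned into XML by MediaWiki itself, which is exactly the caveat in points 1 and 2 above: the wikitext on its own is not XML, and only MediaWiki expands templates correctly.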
