Hi

Your question is too vaguely formulated. Please clarify it.

On Thu, Jan 12, 2012 at 2:37 PM, Sebastian Hellmann <
[email protected]> wrote:

> Hello all,
> is there a query language for wiki syntax?
> (NOTE: I really do not mean the Wikipedia API here.)
>
> I am looking for an easy way to scrape data from Wiki pages.
> In this way, we could apply a crowd-sourcing approach to knowledge
> extraction from Wikis.
>
> There must be thousands of data scraping approaches. But is there one
> amongst them that has developed a "wiki scraper language" ?
> Maybe with some sort of fuzziness involved, if the pages are too messy.
> I have not yet worked with the XML transformation of the wiki markup:
>
> *action=expandtemplates **
>   generatexml         - Generate XML parse tree
>
> Is it any good for issuing XPATH queries ?
>

1. XPath requires XML; MediaWiki markup is not XML.
2. The only application which (correctly!?) expands templates is MediaWiki
itself.
3. You neglected to explain what you are trying to scrape and what
constitutes a messy page.
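
To make point 1 concrete: XPath cannot be applied to raw wikitext, but it
could be applied to an XML parse tree such as the one the generatexml option
mentioned above is said to produce. The following is only a rough sketch in
Python. SAMPLE_TREE is hand-written for illustration and merely assumes a
template/title/part/name/value layout; it is not captured API output, and
template_params is a hypothetical helper name. It uses the limited XPath
subset of the standard library's xml.etree.ElementTree.

```python
import xml.etree.ElementTree as ET

# Hand-written sample, NOT real API output. The element names
# (template, title, part, name, value) are an assumption about the
# shape of the parse tree.
SAMPLE_TREE = """<root><template><title>Infobox settlement
</title><part><name>name</name>=<value>Leipzig
</value></part><part><name>country</name>=<value>Germany
</value></part></template></root>"""


def template_params(xml_text, template_name):
    """Return the named parameters of every template called template_name."""
    root = ET.fromstring(xml_text)
    params = {}
    # ".//template" is an XPath expression: all template elements at any depth.
    for tpl in root.findall(".//template"):
        title = (tpl.findtext("title") or "").strip()
        if title != template_name:
            continue
        for part in tpl.findall("part"):
            name = (part.findtext("name") or "").strip()
            value_el = part.find("value")
            if not name or value_el is None:
                continue  # skip positional or malformed parameters
            # itertext() also collects text inside nested elements.
            params[name] = "".join(value_el.itertext()).strip()
    return params


result = template_params(SAMPLE_TREE, "Infobox settlement")
print(result)
```

This only works on pages where the data already sits in well-formed template
calls. Free-form or "messy" markup would need a different approach, which is
why point 3 matters.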


>
> Thank you very much,
> Sebastian
>
> --
> Dipl. Inf. Sebastian Hellmann
> Department of Computer Science, University of Leipzig
> Projects: http://nlp2rdf.org , http://dbpedia.org
> Homepage: http://bis.informatik.uni-leipzig.de/SebastianHellmann
> Research Group: http://aksw.org
>
>
> _______________________________________________
> Wikitext-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitext-l
>



-- 

Oren Bochman
