On 2012-05-31 12:42, Gerd Zechmeister wrote:
I'd like to extract German noun forms (Kasus and Numerus) but didn't find this 
data in the provided dumps.

Example: http://de.wiktionary.org/wiki/Haus

I need the data from the box:
Kasus        Singular    Plural
Nominativ    das Haus    die Häuser

This data is provided in the wiki template call in the page's wikitext:

{{Deutsch Substantiv Übersicht
|...
|Nominativ Singular=das Haus
|Nominativ Plural=die Häuser
...
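
As a rough sketch of how those parameters could be pulled out of a page's
wikitext (this is not taken from any existing script; the function name
parse_noun_forms is made up for illustration, and it assumes the parameter
values contain no nested templates or piped links, which real entries may):

```python
TEMPLATE_NAME = "Deutsch Substantiv Übersicht"


def parse_noun_forms(wikitext):
    """Return {parameter name: value} for the first overview template found.

    Naive on purpose: the template is assumed to end at the first "}}",
    and parameters are assumed to be separated by plain "|" characters.
    """
    start = wikitext.find("{{" + TEMPLATE_NAME)
    if start == -1:
        return {}
    end = wikitext.find("}}", start)
    if end == -1:
        end = len(wikitext)
    # Skip the opening braces and the template name itself.
    body = wikitext[start + 2 + len(TEMPLATE_NAME):end]
    forms = {}
    for part in body.split("|"):
        name, sep, value = part.partition("=")
        if sep:  # ignore fragments that are not "name=value"
            forms[name.strip()] = value.strip()
    return forms
```

The result is a plain dictionary, so forms["Nominativ Plural"] would give
the plural form whenever the entry defines that parameter.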

You can find that in this XML dump (only 50 MB compressed):
http://dumps.wikimedia.org/dewiktionary/20120526/dewiktionary-20120526-pages-articles.xml.bz2
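
A minimal sketch of reading such a dump without unpacking it first, using
only the Python standard library. The function name iter_pages is
hypothetical, and the code assumes the usual MediaWiki export layout
(page elements containing a title and a revision text); the XML namespace
differs between export versions, so it is stripped rather than hard-coded:

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Drop the "{namespace}" prefix that ElementTree puts on tag names."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(path):
    """Yield (title, wikitext) for every page in a .xml.bz2 dump file."""
    with bz2.open(path, "rb") as stream:
        title, text = None, None
        # "end" events fire once an element has been read completely.
        for _event, elem in ET.iterparse(stream, events=("end",)):
            name = local_name(elem.tag)
            if name == "title":
                title = elem.text
            elif name == "text":
                text = elem.text or ""
            elif name == "page":
                yield title, text
                title, text = None, None
                elem.clear()  # release the finished page's children
```

Looping over iter_pages("dewiktionary-20120526-pages-articles.xml.bz2")
and keeping only the pages whose text contains
"{{Deutsch Substantiv Übersicht" would narrow the dump down to the noun
entries.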

An old Perl script for parsing the XML dumps can be found here:
http://meta.wikimedia.org/wiki/User:LA2/Extraktor


--
  Lars Aronsson ([email protected])
  Aronsson Datateknik - http://aronsson.se



_______________________________________________
Wiktionary-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wiktionary-l
