Thanks, Lars! This seems to be the right source ;)

----- Original Message -----
From: "Lars Aronsson" <[email protected]>
To: "The Wiktionary (http://www.wiktionary.org) mailing list" 
<[email protected]>
Sent: Friday, 1 June 2012 12:08:12
Subject: Re: [Wiktionary-l] Extracting German noun forms

On 2012-05-31 12:42, Gerd Zechmeister wrote:
> I'd like to extract German noun forms (Kasus and Numerus) but didn't find 
> this data in the provided dumps.
>
> Example: http://de.wiktionary.org/wiki/Haus
>
> I need the data from the box:
> Kasus         Singular        Plural
> Nominativ     das Haus        die Häuser

This is provided in the wiki template call

{{Deutsch Substantiv Übersicht
|...
|Nominativ Singular=das Haus
|Nominativ Plural=die Häuser
...
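
To illustrate how the parameters of such a template call could be read, here is a minimal Python sketch. The function name extract_noun_forms is made up for this example, and it assumes the parameter values contain no nested templates or piped links, so that the first "}}" closes the call. Real entries may need a proper wikitext parser.

```python
TEMPLATE_NAME = "Deutsch Substantiv Übersicht"


def extract_noun_forms(wikitext):
    """Return {parameter: value} for the first template call, or {} if absent.

    Assumption: values contain no nested templates or piped links.
    """
    start = wikitext.find("{{" + TEMPLATE_NAME)
    if start == -1:
        return {}
    end = wikitext.find("}}", start)
    if end == -1:
        return {}
    # Skip the opening braces and the template name, keep only the parameters.
    body = wikitext[start + 2 + len(TEMPLATE_NAME):end]
    forms = {}
    for part in body.split("|"):
        if "=" in part:
            key, value = part.split("=", 1)
            forms[key.strip()] = value.strip()
    return forms


# Sample built only from the two parameters shown above.
sample = """{{Deutsch Substantiv Übersicht
|Nominativ Singular=das Haus
|Nominativ Plural=die Häuser
}}"""

for name, form in extract_noun_forms(sample).items():
    print(name, "->", form)
```

Parameters are returned under their German names, so a lookup such as forms["Nominativ Plural"] gives the plural form with its article.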

The template call is in this XML dump (only 50 MB compressed):
http://dumps.wikimedia.org/dewiktionary/20120526/dewiktionary-20120526-pages-articles.xml.bz2

An old Perl script for parsing the XML dumps can be found here:
http://meta.wikimedia.org/wiki/User:LA2/Extraktor
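
As an alternative to the Perl script, the dump can probably also be streamed with Python's standard library. The sketch below is not a port of that script; the function names iter_pages and pages_with_template are made up here, and it assumes the usual MediaWiki export layout with <page>, <title> and <text> elements.

```python
import bz2
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip the XML namespace: '{http://...}page' becomes 'page'."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_file):
    """Yield (title, wikitext) for each <page> in a MediaWiki XML export.

    xml_file is a binary file object; pages are processed one at a time,
    so the text of the whole dump is never held in memory at once.
    """
    title, text = None, None
    for _event, elem in ET.iterparse(xml_file, events=("end",)):
        name = local_name(elem.tag)
        if name == "title":
            title = elem.text
        elif name == "text":
            text = elem.text
        elif name == "page":
            yield title, text or ""
            title, text = None, None
            elem.clear()  # drop the finished page's content


def pages_with_template(path, template="Deutsch Substantiv Übersicht"):
    """Yield (title, wikitext) for pages in a .xml.bz2 dump that call the template."""
    with bz2.open(path, "rb") as dump:
        for title, text in iter_pages(dump):
            if "{{" + template in text:
                yield title, text
```

With the dump saved locally, something like `for title, text in pages_with_template("dewiktionary-20120526-pages-articles.xml.bz2"): print(title)` should list the entries that carry the noun table.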


-- 
   Lars Aronsson ([email protected])
   Aronsson Datateknik - http://aronsson.se



_______________________________________________
Wiktionary-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wiktionary-l
