Thanks, Lars! This seems to be the right source ;)

----- Original Message -----
From: "Lars Aronsson" <[email protected]>
To: "The Wiktionary (http://www.wiktionary.org) mailing list" <[email protected]>
Sent: Friday, 1 June 2012 12:08:12
Subject: Re: [Wiktionary-l] Extracting German noun forms
On 2012-05-31 12:42, Gerd Zechmeister wrote:
> I'd like to extract German noun forms (Kasus and Numerus) but didn't find
> this data in the provided dumps.
>
> Example: http://de.wiktionary.org/wiki/Haus
>
> I need the data from the box:
> Kasus      Singular   Plural
> Nominativ  das Haus   die Häuser

This is provided in the wiki template call

{{Deutsch Substantiv Übersicht
|...
|Nominativ Singular=das Haus
|Nominativ Plural=die Häuser
...

That you find in this XML dump (only 50 MB compressed),
http://dumps.wikimedia.org/dewiktionary/20120526/dewiktionary-20120526-pages-articles.xml.bz2

An old Perl script for parsing the XML dumps is found here,
http://meta.wikimedia.org/wiki/User:LA2/Extraktor

--
Lars Aronsson ([email protected])
Aronsson Datateknik - http://aronsson.se

_______________________________________________
Wiktionary-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wiktionary-l
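
For reference, the template call Lars points to can be read out of the dump with a few lines of Python instead of the Perl script. This is only a rough sketch, not the Extraktor approach: the function names `extract_noun_forms` and `iter_pages` are made up for illustration, and it assumes the template body contains no nested `{{...}}` calls and no `|` inside parameter values, which will not hold for every entry.

```python
import bz2
import xml.etree.ElementTree as ET

TEMPLATE_NAME = "Deutsch Substantiv Übersicht"


def extract_noun_forms(wikitext):
    """Return {parameter: value} for the first noun-overview template call, or {}."""
    start = wikitext.find("{{" + TEMPLATE_NAME)
    if start == -1:
        return {}
    # Assumption: no nested {{...}} inside the template, so the first "}}"
    # after the start is the one that closes it.
    end = wikitext.find("}}", start)
    if end == -1:
        return {}
    body = wikitext[start + 2:end]
    forms = {}
    # The first segment is the template name; the rest are "key=value" pairs.
    # Assumption: values contain no "|" (e.g. no piped wiki links).
    for part in body.split("|")[1:]:
        if "=" in part:
            key, value = part.split("=", 1)
            forms[key.strip()] = value.strip()
    return forms


def iter_pages(dump_path):
    """Yield (title, wikitext) pairs from a bz2-compressed pages-articles dump."""
    with bz2.open(dump_path, "rb") as stream:
        title = None
        for _event, elem in ET.iterparse(stream):
            tag = elem.tag.rsplit("}", 1)[-1]  # drop the XML namespace prefix
            if tag == "title":
                title = elem.text
            elif tag == "text":
                yield title, elem.text or ""
            elif tag == "page":
                elem.clear()  # release the content of pages already handled
```

Looping over `iter_pages("dewiktionary-20120526-pages-articles.xml.bz2")` and calling `extract_noun_forms(text)` on each page should then give entries such as "Nominativ Singular" and "Nominativ Plural" for pages like Haus. A real wikitext parser would be the safer choice for entries that break the assumptions above.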
