I am so glad that someone re-re-resurrects this topic :-)
On Fri, Oct 23, 2009 at 1:27 PM, Andrew Dunbar <[email protected]> wrote:
> I've been spending hours on the parsing now and don't find it simple
> at all due to the fact that templates can be nested. Just extracting
> the Infobox as one big lump is hard due to the need to match nested {{
> and }}

Not perfect, but try
http://toolserver.org/~magnus/wiki2xml/w2x.php

1. Uncheck "Use API", choose "Do not use templates"
2. Enter article name(s)
3. Get XML
4. Parse XML, re-submit the wiki text in templates to process the next
   level of templates

I should really offer #4 in this...

Caveat: Will break on things like HTML attributes that are filled by
templates etc.

Cheers,
Magnus
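
For the narrower problem in the quoted message (pulling the Infobox out as one lump), a depth counter over {{ and }} is one possible alternative to the wiki2xml route. The following is only a minimal Python sketch and is not what w2x.php does. The function names find_templates and extract_template are made up for illustration, and it assumes plain wikitext without {{{parameter}}} triple braces, <nowiki> sections, or HTML comments that contain braces.

```python
def find_templates(text):
    """Return (start, end) spans of the top-level {{...}} templates in text."""
    spans = []
    depth = 0        # how many {{ are currently open
    start = None     # index where the current top-level template began
    i = 0
    while i < len(text) - 1:
        pair = text[i:i + 2]
        if pair == "{{":
            if depth == 0:
                start = i
            depth += 1
            i += 2
        elif pair == "}}" and depth > 0:
            depth -= 1
            if depth == 0:
                spans.append((start, i + 2))
            i += 2
        else:
            i += 1
    return spans


def extract_template(text, name):
    """Return the first top-level template whose name starts with `name`, or None."""
    for start, end in find_templates(text):
        body = text[start + 2:end - 2]
        if body.lstrip().lower().startswith(name.lower()):
            return text[start:end]
    return None


# Hypothetical sample wikitext, not taken from a real article.
wikitext = ("Intro {{Infobox person|name=Ada"
            "|born={{birth date|1815|12|10}}}} text {{cite|x}}")
infobox = extract_template(wikitext, "Infobox")
```

Running find_templates again on the inside of a returned template would give the next level of nesting, similar in spirit to step 4 above.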
