I am so glad that someone is re-re-resurrecting this topic :-)

On Fri, Oct 23, 2009 at 1:27 PM, Andrew Dunbar <[email protected]> wrote:
> I've been spending hours on the parsing now and don't find it simple
> at all due to the fact that templates can be nested. Just extracting
> the Infobox as one big lump is hard due to the need to match nested {{
> and }}

Not perfect, but try
http://toolserver.org/~magnus/wiki2xml/w2x.php

1. Uncheck "Use API", choose "Do not use templates"
2. Enter article name(s)
3. Get XML
4. Parse the XML, then re-submit the wiki text inside each template to
process the next level of nested templates

I should really offer #4 in this...

Caveat: Will break on things like HTML attributes that are filled by
templates etc.
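
If all you need is the Infobox as one lump, the nested {{ and }} can
also be matched by simply counting depth. What follows is only a rough
sketch, not what wiki2xml does: the function name extract_template is
made up for illustration, and it assumes plain article text, so it
ignores <nowiki>, HTML comments and {{{parameter}}} triple braces.

```python
import re


def extract_template(wikitext, name):
    """Return the first {{name ...}} call including nested templates, or None.

    Hypothetical helper: scans forward from the opening braces and tracks
    nesting depth until the matching closing braces are reached.
    """
    # Find the opening of the template, ignoring case and leading spaces.
    match = re.search(r"\{\{\s*" + re.escape(name), wikitext, re.IGNORECASE)
    if match is None:
        return None

    start = match.start()
    depth = 0
    i = start
    while i < len(wikitext) - 1:
        pair = wikitext[i:i + 2]
        if pair == "{{":
            depth += 1
            i += 2
        elif pair == "}}":
            depth -= 1
            i += 2
            if depth == 0:
                # Back at the outermost level: the template ends here.
                return wikitext[start:i]
        else:
            i += 1
    return None  # braces never balanced


text = "Intro {{Infobox person|name=X|born={{birth date|1970|1|1}}}} rest"
infobox = extract_template(text, "Infobox")
```

Here infobox holds the whole call, including the inner birth date
template. Unbalanced input gives None rather than a truncated lump.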

Cheers,
Magnus

_______________________________________________
Wikitech-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitech-l
