There was lots of discussion on this, but not really much of a
conclusion. We're currently using Tidy as a preprocessor for nasty
random HTML pages, but it seems to be overkill. It does a lot of cleanup
to give us perfect HTML out the other end, when all we need is something
that's well-formed.

I had a look at OpenXML, but it's Java and we need something in C or
C++. I can't seem to find the IBM parser that was mentioned. Does anyone
have further pointers they can give me?

Cheers,
Mike.

-- 
Mike Mason, Software Engineer
XML Script Development Team                    Office: 44-1865-203192
http://www.xmlscript.org/                      Mobile: 44-7050-288923
