I have a need to parse fairly simple HTML in my app, but the HTML can be
strictly controlled because it's all going to be made by our own
designers and blessed before being introduced into the product.

For example, I can disallow any tags that are hard to handle and
maintain the strict quality of the code (no unbalanced tags, no leaving
the quotes off attributes and other things that regular HTML parsers are
forced to deal with).

So my idea is that using XHTML as the source text will work well, and I
can parse it with the Xerces library that I've already integrated into
my application.  (I might have to preprocess the HTML from WYSIWYG web
authoring tools into XHTML, but that could even be done by hand in my
case.)

But I need something to interpret the tags and convert them into a
layout of the text and graphics in my app.  What I'd like is something
that I can pass a DOM tree to (SAX would be OK too) and that would ask
my GUI library questions like "what is the width and height of this
text, character, or graphic?" and so on, and then there'd be some way I
could link that up to the layout of my window.
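
A minimal C++ sketch of that kind of callback arrangement might look
like the following.  Every name in it (Measurer, FixedWidthMeasurer,
PlacedWord, layoutParagraph) is hypothetical and is not part of Xerces
or of any GUI library; the fixed-width measurer is only a stand-in for a
real GUI binding, and the string passed in is assumed to be the text
already pulled out of one block element of the parsed tree.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical callback interface: the layout code asks the GUI library
// for measurements instead of knowing anything about fonts itself.
struct Measurer {
    virtual ~Measurer() = default;
    virtual int textWidth(const std::string& text) const = 0;
    virtual int lineHeight() const = 0;
};

// Stand-in for a real GUI binding (an assumption for this sketch):
// every character is 8 units wide and every line is 16 units tall.
struct FixedWidthMeasurer : Measurer {
    int textWidth(const std::string& text) const override {
        return static_cast<int>(text.size()) * 8;
    }
    int lineHeight() const override { return 16; }
};

// One positioned word, ready to be drawn by the application.
struct PlacedWord {
    std::string text;
    int x;
    int y;
};

// Greedy word wrap for the text of one block element (for example the
// character data found under a <p> element in the parsed document).
std::vector<PlacedWord> layoutParagraph(const std::string& text,
                                        int maxWidth,
                                        const Measurer& m) {
    assert(maxWidth > 0);
    std::vector<PlacedWord> out;
    std::istringstream words(text);
    std::string word;
    const int space = m.textWidth(" ");
    int x = 0;
    int y = 0;
    while (words >> word) {
        const int w = m.textWidth(word);
        // Start a new line when the word would overflow, unless the
        // current line is still empty (an overlong word stays put).
        if (x > 0 && x + w > maxWidth) {
            x = 0;
            y += m.lineHeight();
        }
        out.push_back({word, x, y});
        x += w + space;
    }
    return out;
}
```

The application would supply its own Measurer that forwards to the GUI
library's font metrics, call layoutParagraph once per block element
with the window width, and draw each PlacedWord at its (x, y) offset.
Graphics, inline styles, and nested elements are left out of this
sketch.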

Again, in my case I only need really, really simple stuff.  So I could
even be a test case for an incomplete API, as long as what was
implemented could be made to work reliably.

Is there something like this available?  Or would it be hard to write
what I need, given that I already have Xerces working and can choose to
use XHTML?

Mike Crawford
GoingWare - Expert Software Development and Consulting
http://www.goingware.com
[EMAIL PROTECTED]
