I've ported the OpenXML HTML DOM to work on top of the Xerces DOM. This implementation is currently available in the OpenXML release under the package name org.apache.html.dom. If Xerces is available in the classpath, the OpenXML HTML parser will now use the Xerces HTML DOM rather than the OpenXML HTML DOM.
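
To illustrate the classpath fallback described above, here is a minimal sketch of how such a check could be written. It is not the actual OpenXML code: the class DomSelector, the method choose, and the class names passed in main are hypothetical and serve only as an example.

```java
public class DomSelector {

    /**
     * Returns the preferred class name if that class can be loaded from the
     * classpath, otherwise the fallback class name.
     * Hypothetical helper, not part of OpenXML or Xerces.
     */
    static String choose(String preferred, String fallback) {
        try {
            // Look the class up without running its static initializers.
            Class.forName(preferred, false, DomSelector.class.getClassLoader());
            return preferred;
        } catch (ClassNotFoundException | LinkageError e) {
            // The preferred implementation is missing or unusable.
            return fallback;
        }
    }

    public static void main(String[] args) {
        // Both class names are assumptions made for this example.
        String impl = choose("org.apache.html.dom.HTMLDocumentImpl",
                             "org.openxml.dom.html.HTMLDocumentImpl");
        System.out.println("Using HTML DOM implementation: " + impl);
    }
}
```

Looking the class up by name keeps the caller free of a compile-time dependency on the optional implementation.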
I've identified a conflict in Xerces: ElementImpl defines a public method getValue() that conflicts with HTMLLIElement, which defines the same method with a different return type. I could not find any use of ElementImpl.getValue(), and commenting it out did not break the build. I would like confirmation that this fix will not break anything before I commit it.

Providing HTML functionality in Xerces requires adding three packages: the HTML DOM (org.w3c.dom.html), a Xerces HTML implementation, and the parser (a separate code base from the XML parser). This will increase the JAR size by an additional 100KB, so it might (or might not) be smart to ship it as a separate add-on JAR.

Currently I'm using the package name org.apache.html.dom. Is this in line with the proposed org.apache.xml, and should the parser reside in org.apache.html.parser?

arkin

--
____________________________________________________________
Assaf Arkin [EMAIL PROTECTED]
CTO http://www.exoffice.com
Exoffice, The ExoLab Company tel: (650) 259-9796
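
For illustration, the getValue() conflict mentioned above can be reproduced in miniature. This is a sketch with stand-in types, not the Xerces source: ListItem plays the role of HTMLLIElement (whose getValue() returns int in the DOM Level 1 HTML Java binding, as far as I know), BaseElement plays the role of ElementImpl, and the String return type of the commented-out method is an assumption.

```java
public class ReturnTypeConflictSketch {

    // Stand-in for HTMLLIElement.
    interface ListItem {
        int getValue();
    }

    // Stand-in for ElementImpl.
    static class BaseElement {
        // Uncommenting this method makes ListItemImpl fail to compile:
        // its inherited getValue() would clash with ListItem.getValue()
        // because the return types are incompatible.
        // (The String return type is an assumption for this sketch.)
        // public String getValue() { return null; }
    }

    // Stand-in for an HTML LI element built on top of the base element.
    static class ListItemImpl extends BaseElement implements ListItem {
        private final int value;

        ListItemImpl(int value) {
            this.value = value;
        }

        @Override
        public int getValue() {
            return value;
        }
    }

    public static void main(String[] args) {
        ListItem item = new ListItemImpl(3);
        System.out.println(item.getValue());
    }
}
```

Java does not allow two methods with the same name and parameter list but unrelated return types in one class hierarchy, so a subclass cannot satisfy both declarations at once.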
