I once had pretty good success parsing some sloppy HTML right off the web, through an HTTP proxy server, with a parser called neko. I can provide code samples off-list if you need them. It is also an Apache offering.

Timothy Jones
Syniverse Technologies          Work (813) 637-5366
Sr. Systems Engineer            Cell (813) 857-7650
Development, Tampa, FL

________________________________

From: Dave Brosius [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, August 21, 2007 9:37 AM
To: Michael Bauer
Cc: xalan-j-users@xml.apache.org
Subject: Re: Ignoring errors

No, but there are various HTML 'tidying' tools that you could use to pre-parse the HTML before passing it to the transformer.

Michael Bauer <[EMAIL PROTECTED]>
08/21/2007 09:33 AM
To: xalan-j-users@xml.apache.org
cc:
Subject: Ignoring errors

I am using Xalan/Xerces to parse out some data from a web page. The problem is that the web page is not well-formed, and running the Transformer on it produces:

ERROR: 'Open quote is expected for attribute "href|".'
ERROR: 'com.sun.org.apache.xml.internal.utils.WrappedRuntimeException: Open quote is expected for attribute "href|".'

Is there any way to instruct the Parser/Transformer to ignore such errors?
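
For illustration, here is a minimal sketch of why the answer to the question is "No". It uses only the JAXP classes bundled with the JDK. The class name LenientParseDemo, the helper parses(), and the sample markup are made up for this example and do not come from the thread. The sketch installs an ErrorHandler that silently swallows every callback and then checks whether the parse still fails. It assumes the parser's default configuration, in which a well-formedness violation such as an unquoted attribute value is a fatal error and the parser stops regardless of what the handler does.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical demo class, not part of Xalan or Xerces.
class LenientParseDemo {

    // Returns true if the markup parses as XML while every
    // ErrorHandler callback is deliberately ignored.
    static boolean parses(String markup) {
        try {
            DocumentBuilder builder =
                    DocumentBuilderFactory.newInstance().newDocumentBuilder();
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { /* ignored */ }
                public void error(SAXParseException e) { /* ignored */ }
                public void fatalError(SAXParseException e) { /* ignored */ }
            });
            builder.parse(new InputSource(new StringReader(markup)));
            return true;
        } catch (SAXException e) {
            // Assumption: with default settings the parser still aborts
            // on a fatal error even though the handler did not rethrow.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Well-formed: attribute value is quoted.
        System.out.println(parses("<a href=\"page.html\">ok</a>"));
        // Not well-formed: attribute value is unquoted.
        System.out.println(parses("<a href=page.html>broken</a>"));
    }
}
```

Since swallowing the callbacks does not get a malformed page through the parser, the practical route is the one suggested in the replies: run the page through an HTML tidying tool or a tolerant HTML parser first, and hand the cleaned-up result to the transformer.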