[EMAIL PROTECTED] wrote:
> href="http://www.abcd.com/xyz/lmt/default.asp?bcid=1&lmt=1"> ... </a>

Your document may be well-formed HTML, but that doesn't make it well-formed by XML standards. For example, the sample you've listed is not well-formed XML: the '&' character must also be escaped (as '&amp;') within attribute values.

In order to parse these documents, you first need to "fix up" your HTML source. There are a variety of tools you can use for this. I would suggest either JTidy[1] or NekoHTML[2]. I prefer NekoHTML because it integrates better with Xerces2 and also because I wrote it. ;)

[1] http://sourceforge.net/projects/jtidy
[2] http://www.apache.org/~andyc/

--
Andy Clark * [EMAIL PROTECTED]
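
To see the well-formedness problem for yourself, here is a minimal sketch that uses only the JAXP classes shipped with the JDK (not JTidy or NekoHTML). The class name WellFormedCheck and the method isWellFormed are made up for illustration, and because the quoted sample is only a fragment, the opening "<a " is an assumption added so that the snippet is a complete element.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
public class WellFormedCheck {

    /** Returns true if the text parses as well-formed XML, false otherwise. */
    public static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // DefaultHandler ignores warnings and recoverable errors and
            // rethrows fatal errors, so nothing is printed to stderr.
            builder.setErrorHandler(new DefaultHandler());
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Well-formedness violations are reported as SAX exceptions.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumption: "<a " is prepended to the fragment quoted in the mail.
        String raw =
            "<a href=\"http://www.abcd.com/xyz/lmt/default.asp?bcid=1&lmt=1\"> ... </a>";
        // Same element with the ampersand escaped as &amp;
        String escaped =
            "<a href=\"http://www.abcd.com/xyz/lmt/default.asp?bcid=1&amp;lmt=1\"> ... </a>";

        System.out.println("raw:     " + isWellFormed(raw));
        System.out.println("escaped: " + isWellFormed(escaped));
    }
}
```

The raw link should be rejected, because the parser reads "&lmt" as the start of an entity reference that never ends with ';', while the escaped link should parse. Real-world HTML usually has more problems than a bare ampersand (unclosed tags, unquoted attributes), which is why a clean-up tool is the more practical route than escaping by hand.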
