[EMAIL PROTECTED] wrote:
>   href="http://www.abcd.com/xyz/lmt/default.asp?bcid=1&lmt=1"> ... </a>

Your document may be acceptable HTML, but that doesn't make it
well-formed by XML standards. For example, the sample you've
listed is not well-formed XML: a literal '&' character must be
escaped as '&amp;', and that applies within attribute values too.
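
To see the difference for yourself, here is a minimal sketch that
feeds both forms of the attribute to the JAXP parser that ships with
the JDK. The class name AmpersandCheck, the helper isWellFormed, and
the "link" element content are made up for illustration; only the URL
comes from your sample.

```java
import java.io.IOException;
import java.io.StringReader;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.SAXParseException;

// Hypothetical helper class, not part of Xerces, JTidy or NekoHTML.
class AmpersandCheck {

    // Returns true if the string parses as well-formed XML.
    static boolean isWellFormed(String xml) {
        try {
            DocumentBuilder builder =
                DocumentBuilderFactory.newInstance().newDocumentBuilder();
            // Rethrow parse errors instead of letting the default
            // handler print them to stderr.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXException { throw e; }
                public void fatalError(SAXParseException e) throws SAXException { throw e; }
            });
            builder.parse(new InputSource(new StringReader(xml)));
            return true;
        } catch (SAXException e) {
            // Well-formedness violations surface as SAX exceptions.
            return false;
        } catch (ParserConfigurationException | IOException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        String raw =
            "<a href=\"http://www.abcd.com/xyz/lmt/default.asp?bcid=1&lmt=1\">link</a>";
        // Safe here only because the sample contains no existing entity references.
        String escaped = raw.replace("&", "&amp;");

        System.out.println(isWellFormed(raw));      // false
        System.out.println(isWellFormed(escaped));  // true
    }
}
```

The blanket replace() is only good enough for this one string. On a
real page it would double-escape references that are already correct,
such as an existing '&amp;'.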

In order to parse these documents, you need to "fix up" your 
HTML source. There are a variety of tools you can use for this. 
I would suggest either JTidy[1] or NekoHTML[2]. I prefer 
NekoHTML because it integrates better with Xerces2 and also 
because I wrote it. ;)

[1] http://sourceforge.net/projects/jtidy
[2] http://www.apache.org/~andyc/

-- 
Andy Clark * [EMAIL PROTECTED]
