Andy's NekoHTML parser has worked well for me in a small project where I needed to scrape some data from a set of HTML pages. With NekoHTML as the front end I was able to use an XSLT stylesheet to extract that data directly from the HTML pages.
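
A minimal sketch of this kind of XSLT-driven extraction through JAXP, for illustration only. The class name `ScrapeSketch`, the stylesheet, and the sample markup are hypothetical and not taken from the project described above. To stay self-contained it feeds a small well-formed string to the transformer; in the setup described, NekoHTML would be the parser supplying the document from real-world HTML, and the element-name case it reports (for example `TD` versus `td`) depends on how it is configured, so the match pattern may need adjusting.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

class ScrapeSketch {
    // Hypothetical stylesheet: write the text of every table cell, one per line.
    static final String XSLT =
        "<xsl:stylesheet version='1.0'"
      + " xmlns:xsl='http://www.w3.org/1999/XSL/Transform'>"
      + "<xsl:output method='text'/>"
      + "<xsl:template match='/'>"
      + "<xsl:for-each select='//td'>"
      + "<xsl:value-of select='normalize-space(.)'/>"
      + "<xsl:text>&#10;</xsl:text>"
      + "</xsl:for-each>"
      + "</xsl:template>"
      + "</xsl:stylesheet>";

    // Apply the stylesheet to a well-formed markup string and return the text output.
    static String extract(String markup) throws Exception {
        Transformer transformer = TransformerFactory.newInstance()
            .newTransformer(new StreamSource(new StringReader(XSLT)));
        StringWriter out = new StringWriter();
        transformer.transform(new StreamSource(new StringReader(markup)),
                              new StreamResult(out));
        return out.toString();
    }

    public static void main(String[] args) throws Exception {
        // Stand-in for a parsed HTML page; assumed sample data.
        String page = "<html><body><table>"
                    + "<tr><td> first </td><td>second</td></tr>"
                    + "</table></body></html>";
        System.out.print(extract(page));
    }
}
```

Only the `Source` passed to `transform` would change when an HTML parser sits in front; the stylesheet and the rest of the JAXP plumbing stay the same.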
NekoHTML also allowed me to write a simple HTML transformation that I find useful when analyzing HTML page layouts: adding a small colored border to each TABLE so that the table boundaries are visible. This transformation requires only a few lines of XSLT added to a standard "identity" transformation.

I expect that NekoHTML would make it easy to translate HTML code into XHTML format.

I have encountered a few tag-balancing glitches, where NekoHTML struggles to accommodate ill-formed HTML code much as the popular browsers do, but overall it has been very solid.

NekoHTML is very easy to use. For the most part it is a transparent addition to a standard Xerces/Xalan configuration, and all the usual APIs -- including JAXP -- seem to work as expected.

Nice work, Andy. Thank you for making NekoHTML available.

--
Fred Yankowski      [EMAIL PROTECTED]      tel: +1.630.879.1312
OntoSys, Inc        PGP keyID: 7B449345    fax: +1.630.879.1370
www.ontosys.com     38W242 Deerpath Rd, Batavia, IL 60510-9461, USA
