The problem is, HTML is not an XML-based language, so unless you've deliberately written your input document as XHTML, odds are that no XML parser will accept it.
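To see the difference concretely, here is a minimal sketch using Python's standard-library XML parser as a stand-in for any XML parser (Xalan's front end would complain in a similar way). The two sample strings and the helper name `is_well_formed_xml` are made up for illustration; they are not from your document.

```python
import xml.etree.ElementTree as ET

# Typical HTML: unclosed <br> and <p>, and an unquoted attribute value.
HTML_SAMPLE = (
    "<html><body><p>First line<br>Second line"
    "<p align=center>Done</body></html>"
)

# The same content as XHTML: every element closed, attribute values quoted.
XHTML_SAMPLE = (
    '<html xmlns="http://www.w3.org/1999/xhtml"><body>'
    '<p>First line<br/>Second line</p><p align="center">Done</p>'
    "</body></html>"
)


def is_well_formed_xml(text):
    """Return True if the XML parser accepts the text, False otherwise."""
    try:
        ET.fromstring(text)
        return True
    except ET.ParseError:
        return False


print(is_well_formed_xml(HTML_SAMPLE))   # the HTML version is rejected
print(is_well_formed_xml(XHTML_SAMPLE))  # the XHTML version parses
```

The HTML string is perfectly acceptable to a browser, but the XML parser stops at the first construct that isn't well-formed.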
There are HTML parsers available which produce SAX or DOM (XML) output. You could get one of those, use it to read the input document, and route its output to Xalan for processing.

Or you could look for a tool which rewrites HTML as XHTML. I believe the W3C's "tidy" tool can be configured to do that. Then you'd run the resulting XHTML document (which _is_ XML) through Xalan.

______________________________________
"You build world of steel and stone
I build worlds of words alone
Skilled tradespeople, long years taught:
You shape matter; I shape thought."
(http://www.songworm.com/lyrics/songworm-parody/ShapesofShadow.html)