Hi Jenny,

Things are looking a little funny here. First you do this:

    transformer.setOutputProperty(OutputKeys.METHOD, "html");

but you also do this:

    transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");

If you are really getting the html output method, there is no reason to omit the XML declaration, because the html method does not write one.

If an element has a name that is recognized as an HTML element (HTML, HEAD, BR, STRONG, TEXTAREA ...) and that name is in no namespace, then the element is treated as an HTML element. Looking at the serializer code in Xalan-J 2.7.1, the minimized form is never emitted for HTML elements. However, one can have XML elements within HTML. An element is treated as an XML element if it is in a namespace, or if its name is not one of the recognized HTML elements.

In your case it would appear that either the output method is xml and all elements are treated as XML, hence the minimized form, or these elements have the right names but are not treated as HTML because they are in a default namespace. I'm leaning towards the possibility that the effective output method is xml. What happens when you don't ask for the XML declaration to be omitted? Does it come out?

I think you've created an identity transformation to do your serialization and it is too late to set the output method. Try setting the output method on the object that you get from TransformerFactory.newInstance(), before you call newTransformer() on it.

- Brian

From: "Jenny Brown" <[EMAIL PROTECTED]>
To: xalan-j-users@xml.apache.org
Date: 04/16/2008 09:27 PM
Subject: Trouble exporting HTML from a DOM in memory

I have an HTML DOM tree in memory (after having passed HTML through JTidy and NekoHTML for validation/cleanup) and I'm trying to write it back out as valid HTML. I'm using Xerces 2.9.1 and Xalan 2.7.1 with Sun JDK 1.5.0_14. I'm running this from the command line, so I have careful control of the classpath. The jars in my project are very minimal, but I wouldn't rule out conflicts with the JDK yet (though I'm not sure how to check that).
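One way to check for such a conflict is to print which TransformerFactory implementation JAXP actually selects and where that class was loaded from. This is only a minimal sketch using the standard JAXP API; the class name WhichTransformer and the method describeFactory are made up for illustration.

```java
import java.security.CodeSource;
import javax.xml.transform.TransformerFactory;

public class WhichTransformer {

    /** Returns "implementation class @ location" for the factory JAXP selects. */
    public static String describeFactory() {
        TransformerFactory factory = TransformerFactory.newInstance();
        Class<?> impl = factory.getClass();
        CodeSource source = impl.getProtectionDomain().getCodeSource();
        // A missing code source usually means the class was loaded from the JDK itself.
        String location = (source == null || source.getLocation() == null)
                ? "(no code source, probably bundled with the JDK)"
                : source.getLocation().toString();
        return impl.getName() + " @ " + location;
    }

    public static void main(String[] args) {
        System.out.println(describeFactory());
    }
}
```

If the Xalan jar is being picked up, the class name should be in the org.apache.xalan package and the location should point at that jar; a name under com.sun.org.apache.xalan.internal would suggest the JDK's bundled copy is in use instead.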
The specific examples I'm having trouble with follow, as well as the code I'm using to do the export. The main situation I'm having trouble with is empty tags. For instance, my input file contains:

    <P>This is some <STRONG></STRONG> paragraph text.</P>
    <P>This is a textarea. <TEXTAREA name="foo"></TEXTAREA> It has text after it.</P>

It gets into my in-memory DOM tree okay. But when I try to use a transformer to output the HTML, I instead get this, which Firefox chokes on:

    <P>This is some <STRONG/> paragraph text.</P>
    <P>This is a textarea. <TEXTAREA name="foo"/> It has text after it.</P>

(Firefox sees <STRONG/> and thinks it means <STRONG>, and sees <TEXTAREA/> and thinks it means <TEXTAREA>, which leaves the tags hanging open, so they boldface or otherwise consume the rest of the page; on other tags such as div it may even make the whole page un-renderable.)

So here's what I'm doing for export code, and my intention is simply to produce valid HTML that a browser can render later.

============
Transformer transformer = TransformerFactory.newInstance().newTransformer();
transformer.setOutputProperty(OutputKeys.METHOD, "html");
transformer.setOutputProperty(OutputKeys.MEDIA_TYPE, "text/html");
transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
transformer.setOutputProperty(OutputKeys.OMIT_XML_DECLARATION, "yes");
StringWriter sw = new StringWriter();
try {
    transformer.transform(new DOMSource(domDocument), new StreamResult(sw));
} catch (TransformerException te) {
    return(te.toString());
}
============

(Yes, I do really actually want it in a string after that, not an output stream... this will eventually be a module in the middle of a handling pipeline.)

So, I'm trying to tell it to give me HTML, but what I get is a document that contains XML-like empty tags wherever the tag was empty, which results in browser bombs, and starts with:

    <HTML xmlns="http://www.w3.org/1999/xhtml" lang="en">

I'm sure there's something I'm missing here (configuration? other setup?), but I'm not sure what.

Thanks for your help.

Jenny Brown
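
Because the output starts with an HTML element carrying xmlns="http://www.w3.org/1999/xhtml", the default-namespace explanation from the reply can be tried in isolation. The sketch below builds the same small paragraph twice, once with no namespace and once in the XHTML namespace, and serializes both with the html output method. It is an illustration under assumptions, not the code from the thread: the class HtmlNamespaceSketch and its methods build and serialize are hypothetical names, and the result may differ between serializer implementations.

```java
import java.io.StringWriter;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.transform.OutputKeys;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.dom.DOMSource;
import javax.xml.transform.stream.StreamResult;
import org.w3c.dom.Document;
import org.w3c.dom.Element;

public class HtmlNamespaceSketch {

    public static final String XHTML_NS = "http://www.w3.org/1999/xhtml";

    /** Builds a P element containing an empty STRONG; pass null for "no namespace". */
    public static Document build(String namespaceUri) {
        try {
            DocumentBuilderFactory dbf = DocumentBuilderFactory.newInstance();
            dbf.setNamespaceAware(true);
            Document doc = dbf.newDocumentBuilder().newDocument();
            Element p = doc.createElementNS(namespaceUri, "P");
            p.appendChild(doc.createTextNode("This is some "));
            p.appendChild(doc.createElementNS(namespaceUri, "STRONG"));
            p.appendChild(doc.createTextNode(" paragraph text."));
            doc.appendChild(p);
            return doc;
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    /** Serializes the document to a string through an identity transform, html method. */
    public static String serialize(Document doc) {
        try {
            Transformer transformer = TransformerFactory.newInstance().newTransformer();
            transformer.setOutputProperty(OutputKeys.METHOD, "html");
            transformer.setOutputProperty(OutputKeys.ENCODING, "UTF-8");
            StringWriter sw = new StringWriter();
            transformer.transform(new DOMSource(doc), new StreamResult(sw));
            return sw.toString();
        } catch (Exception e) {
            throw new RuntimeException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("no namespace:    " + serialize(build(null)));
        System.out.println("XHTML namespace: " + serialize(build(XHTML_NS)));
    }
}
```

With a Xalan-derived serializer, the no-namespace document should come out with <STRONG></STRONG>. If the namespaced one comes out with <STRONG/>, that would match the behavior described in the reply, and the remaining question would be how to get the elements out of the XHTML namespace before serializing.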