DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24278>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24278

Incorrect SAXException about bad integral value of a character to be written out.

------- Additional Comments From [EMAIL PROTECTED] 2003-11-12 21:54 -------

Sergey,

Xalan-J used an internal utility serializer, ToTextStream, with no encoding set in 2.5.1, and nothing has changed about that in 2.5.2. The encoding was null in 2.5.1 and is still null in 2.5.2. This utility serializer is used internally to turn things into strings.

The stylesheet which I previously attached for this bug is:

--------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
  <xsl:output method="xml" encoding="UTF-8" />
  <xsl:template match="doc">
    <out>
      <xsl:attribute name="bgcolor" >
        <xsl:value-of select="foo"/>
      </xsl:attribute>
    </out>
  </xsl:template>
</xsl:stylesheet>
------------------------------------------------------

Xalan is trying to evaluate <xsl:value-of select="foo"/> and is using an internal utility ToTextStream to evaluate that into a string. The way that Xalan creates its utility ToTextStream serializer causes it to not have an encoding. It does seem a little strange, but this isn't the final serialization step, so the encoding doesn't apply anyway. The encoding="UTF-8" in this testcase is for the serialization to XML done at the end of the processing, and that will be done with a ToXMLStream serializer. So there isn't a problem to be fixed regarding the ToTextStream serializer not having an encoding.

What did happen between 2.5.1 and 2.5.2 is that I applied a patch to fix bug 795, which caused these problems (sorry about that!). I thought my first patch to this defect would be OK, but thanks to Chris Trautwein for finding that it wasn't. My second patch in this bug is for ToTextStream to emit an error message only if the encoding is set and the output character doesn't fit in that encoding.
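The rule in the second patch can be sketched roughly as below. This is only an illustration, not the actual Xalan code: the class and method names are hypothetical, and java.nio's CharsetEncoder is an assumed stand-in for whatever encoding check the serializer really uses. A null encoding name models the internal utility serializer.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical sketch; names are not taken from the Xalan source.
class EncodingCheckSketch {

    // Returns true only when an encoding is set AND the character
    // cannot be represented in that encoding.
    static boolean shouldReport(String encodingName, char ch) {
        if (encodingName == null) {
            // No encoding set (the internal utility case): nothing to
            // check the character against, so never report an error.
            return false;
        }
        CharsetEncoder encoder = Charset.forName(encodingName).newEncoder();
        return !encoder.canEncode(ch);
    }

    public static void main(String[] args) {
        char euro = '\u20AC'; // EURO SIGN, outside US-ASCII
        System.out.println(shouldReport(null, euro));       // false
        System.out.println(shouldReport("US-ASCII", euro)); // true
        System.out.println(shouldReport("UTF-8", euro));    // false
    }
}
```

The null check comes first on purpose: with no encoding there is no range to be out of, which is why the utility serializer should stay silent.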
So for ToTextStream internal utility serializers used by Xalan, which have no encoding set, there would be no change in behavior from 2.5.1 after that patch is applied. There should be a change in behavior only if you are serializing to text and a character is out of range, because that ToTextStream serializer does have an encoding set.

The second patch is still the right one. We know why the utility serializer has no encoding set, but setting one for it would probably be wrong, because character entity processing could incorrectly be done twice rather than once.

If, with the patch applied, you have a testcase with an illegal character in an attribute value that used to get an error message in 2.5.1 but no longer does, please attach the testcase.

Regards,
Brian Minchau
