DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24278>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=24278

Incorrect SAXException about bad integral value of a character to be written out.





------- Additional Comments From [EMAIL PROTECTED]  2003-11-12 21:54 -------
Sergey,
Xalan-J used an internal utility serializer, ToTextStream, with no encoding set 
in 2.5.1, and nothing has changed about that in 2.5.2. The encoding was null in 
2.5.1 and is still null in 2.5.2.  This utility serializer is used internally 
to turn things into strings.

The stylesheet which I previously attached for this bug is:
--------------------------------------------
<?xml version="1.0" encoding="UTF-8"?>
<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version="1.0">
<xsl:output method="xml" encoding="UTF-8" />

<xsl:template match="doc">
<out>
     <xsl:attribute name="bgcolor" >  
        <xsl:value-of select="foo"/>
      </xsl:attribute>
</out>
</xsl:template>

</xsl:stylesheet>
------------------------------------------------------

Xalan is trying to evaluate <xsl:value-of select="foo"/> and is using an 
internal utility ToTextStream to evaluate that into a string.

The way that Xalan creates its utility ToTextStream serializer causes it to not 
have an encoding.  It does seem a little strange, but this isn't the final 
serialization step, so the encoding doesn't apply anyway.  The encoding="UTF-8" 
in this testcase is for the serialization to XML done at the end of the 
processing, and will be done with a ToXMLStream serializer. 
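
To make the two stages concrete, here is a minimal sketch that runs the 
stylesheet above through the standard JAXP API. The input document 
<doc><foo>red</foo></doc> and the class and method names are assumptions made 
for illustration only; they are not part of the original testcase. It uses 
whichever XSLT processor JAXP finds, which may or may not be Xalan-J. The 
encoding="UTF-8" on xsl:output only matters for the final result produced by 
transform(), not for the intermediate string built for the attribute value.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

// Hypothetical driver class; the names here are illustrative only.
final class AttributeValueSketch {

    // The stylesheet from this bug report, embedded as a string.
    static final String XSL =
        "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n"
        + "<xsl:stylesheet xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\" version=\"1.0\">\n"
        + "<xsl:output method=\"xml\" encoding=\"UTF-8\" />\n"
        + "<xsl:template match=\"doc\">\n"
        + "<out>\n"
        + "  <xsl:attribute name=\"bgcolor\">\n"
        + "    <xsl:value-of select=\"foo\"/>\n"
        + "  </xsl:attribute>\n"
        + "</out>\n"
        + "</xsl:template>\n"
        + "</xsl:stylesheet>\n";

    // Applies the stylesheet to inputXml and returns the serialized result.
    static String transform(String inputXml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(
                new StreamSource(new StringReader(inputXml)),
                new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            // Surface processing failures as an unchecked exception.
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        // Assumed input; the original input document is not shown in this bug.
        System.out.println(transform("<doc><foo>red</foo></doc>"));
    }
}
```

With the assumed input, the result should be an <out> element whose bgcolor 
attribute carries the string value of <foo>.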

So there isn't a problem to be fixed regarding the ToTextStream serializer not 
having an encoding.  What did happen between 2.5.1 and 2.5.2 is that I applied 
a patch to fix bug 795 which caused these problems (sorry about that!).  I 
thought my first patch to this defect would be OK, but thanks to Chris 
Trautwein for finding that it wasn't.  My second patch in this bug is for 
ToTextStream to emit an error message only if the encoding is set and the 
output character doesn't fit in that encoding.  So for ToTextStream internal 
utility serializers used by Xalan, which have no encoding set, there would be 
no change in behavior from 2.5.1 after that patch is applied.
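
The rule in the second patch can be sketched roughly as follows. This is not 
the actual ToTextStream code; the class and method names are hypothetical, and 
the check uses the JDK's java.nio.charset.CharsetEncoder rather than Xalan's 
own encoding tables. It only illustrates the decision: no encoding set means 
no error, and an error is reported only when an encoding is set and the 
character does not fit in it.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

// Hypothetical illustration; not the actual ToTextStream code from Xalan-J.
final class EncodingCheckSketch {

    /**
     * Returns true when a character should be reported as not representable.
     * A null encoding name models the internal utility serializer, which
     * has no encoding and therefore never reports an error.
     */
    static boolean shouldReport(String encodingName, char ch) {
        if (encodingName == null) {
            return false; // no encoding set: nothing to check against
        }
        CharsetEncoder encoder = Charset.forName(encodingName).newEncoder();
        return !encoder.canEncode(ch);
    }

    public static void main(String[] args) {
        char euro = '\u20AC';
        System.out.println(shouldReport(null, euro));       // false
        System.out.println(shouldReport("US-ASCII", euro)); // true
        System.out.println(shouldReport("UTF-8", euro));    // false
    }
}
```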

There should be a change in behavior only if you are serializing the final 
output as text and a character is out of range for the output encoding, because 
that ToTextStream serializer does have an encoding set.

The second patch is still the right one.  We know why the utility serializer 
has no encoding set, but setting one for it would probably be wrong because 
character entity processing could incorrectly be done twice rather than once.

If, with the patch applied, you have a testcase with an illegal character in an 
attribute value that used to get an error message in 2.5.1 but no longer does, 
please attach the testcase.
Regards,
Brian Minchau
