DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=6328>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.

           Summary: normalize-space deletes space chars within strings /
                    chunking problem
           Product: XalanJ2
           Version: 2.2.0
          Platform: All
        OS/Version: All
            Status: NEW
          Severity: Critical
          Priority: Other
         Component: org.apache.xalan.serialize
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]


In some cases, when a string such as "This is a string" is serialized to the 
output with <xsl:value-of disable-output-escaping="yes" 
select="normalize-space(.)"/>, Xalan removes some of the spaces, so the output 
is e.g. "This isa string".
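
A minimal harness for trying the stylesheet instruction above might look like 
the following sketch. It uses only the standard JAXP API and whatever XSLT 
processor is on the classpath. The class name NormalizeSpaceRepro, the 
surrounding <p> element and the html output method are assumptions made for 
illustration, not taken from our real stylesheet. A short input like the one 
in main is unlikely to be chunked, so a failing run would probably need a much 
larger document.

```java
import java.io.StringReader;
import java.io.StringWriter;
import javax.xml.transform.Transformer;
import javax.xml.transform.TransformerException;
import javax.xml.transform.TransformerFactory;
import javax.xml.transform.stream.StreamResult;
import javax.xml.transform.stream.StreamSource;

/** Hypothetical reproduction harness; not part of Xalan. */
final class NormalizeSpaceRepro {

    // Assumed stylesheet: wraps the value-of from the report in a <p> element.
    static final String XSL =
          "<xsl:stylesheet version=\"1.0\""
        + " xmlns:xsl=\"http://www.w3.org/1999/XSL/Transform\">"
        + "<xsl:output method=\"html\"/>"
        + "<xsl:template match=\"/\">"
        + "<p><xsl:value-of disable-output-escaping=\"yes\""
        + " select=\"normalize-space(.)\"/></p>"
        + "</xsl:template>"
        + "</xsl:stylesheet>";

    /** Runs the stylesheet on the given XML text and returns the serialized result. */
    static String transform(String xml) {
        try {
            Transformer transformer = TransformerFactory.newInstance()
                    .newTransformer(new StreamSource(new StringReader(XSL)));
            StringWriter out = new StringWriter();
            transformer.transform(new StreamSource(new StringReader(xml)),
                                  new StreamResult(out));
            return out.toString();
        } catch (TransformerException e) {
            throw new IllegalStateException("transformation failed", e);
        }
    }

    public static void main(String[] args) {
        // A correct processor should keep exactly one space between the words.
        System.out.println(transform("<doc>  This   is a\n string </doc>"));
    }
}
```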

This happens if Xalan splits a string from the XML input into chunks and a 
space happens to be the last char of a chunk's char array. I traced the case 
as far as class "org.apache.xalan.serialize.SerializerToHTML", method "public 
void characters(char chars[], int start, int length)", but unfortunately I 
currently don't have the time to track down the source of the bug. I hope the 
description is sufficient to locate the problem.

I provide a constructed example, because our real case is too complex. The 
problem itself can still be shown clearly.

Take the string "This is a string" as the example. It can happen that the 
method 'characters' is first invoked with a char array whose last elements 
are the beginning of the example string: [...,T,h,i,s, ,i,s, ]. Note that the 
last char is a space. The variable start of method 'characters' points to the 
letter 'T', and the variable length is long enough to include the letter 's' 
of 'is', but it cuts the array off before the space at the end. On the next 
invocation of the method, the array begins with [a, ,s,t,r,i,n,g,...]. Both 
chunks are combined in the output, and the space between 'is' and 'a' is 
missing.
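
The two invocations described above can be replayed with a small stand-in for 
the serializer. This is only a sketch: ChunkBoundaryDemo and CollectingHandler 
are hypothetical names, the "..." padding stands for unrelated buffer content, 
and the start and length values are chosen to match the constructed example, 
not taken from a real debugging session.

```java
import org.xml.sax.helpers.DefaultHandler;

/** Hypothetical demo, not Xalan code: replays the two characters() calls. */
final class ChunkBoundaryDemo {

    /** Stand-in for the serializer: appends exactly the range it is given. */
    static final class CollectingHandler extends DefaultHandler {
        final StringBuilder out = new StringBuilder();

        @Override
        public void characters(char[] chars, int start, int length) {
            out.append(chars, start, length);
        }
    }

    /** Feeds both chunks to the handler and returns the combined output. */
    static String replay() {
        CollectingHandler handler = new CollectingHandler();

        // First chunk ends with a space, but length stops after the 's' of "is".
        char[] first = "...This is ".toCharArray();
        handler.characters(first, 3, 7);

        // Second chunk starts directly with "a string".
        char[] second = "a string...".toCharArray();
        handler.characters(second, 0, 8);

        return handler.out.toString();
    }

    public static void main(String[] args) {
        System.out.println(replay()); // prints "This isa string"
    }
}
```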

If the same XSL is executed without normalize-space, the space is not lost; 
it is part of the output.

It seems as if normalize-space is applied to the individual chunks and not to 
the complete strings of the XML input.
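
To illustrate this suspicion, the following sketch compares normalizing each 
chunk on its own with normalizing the joined string. It is not Xalan code: the 
class ChunkedNormalizeDemo and its methods are hypothetical, and 
normalizeSpace is a plain re-implementation of the XPath normalize-space rules 
(strip leading and trailing whitespace, collapse inner runs to one space).

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical demo, not Xalan code: per-chunk versus whole-string normalization. */
final class ChunkedNormalizeDemo {

    /** Strips leading/trailing XML whitespace and collapses inner runs to one space. */
    static String normalizeSpace(String s) {
        return s.replaceAll("\\A[ \t\r\n]+|[ \t\r\n]+\\z", "")
                .replaceAll("[ \t\r\n]+", " ");
    }

    /** Suspected behaviour: every chunk is normalized separately, then concatenated. */
    static String normalizePerChunk(List<String> chunks) {
        StringBuilder out = new StringBuilder();
        for (String chunk : chunks) {
            out.append(normalizeSpace(chunk));
        }
        return out.toString();
    }

    /** Expected behaviour: the chunks are joined first, then normalized once. */
    static String normalizeWhole(List<String> chunks) {
        return normalizeSpace(String.join("", chunks));
    }

    public static void main(String[] args) {
        List<String> chunks = Arrays.asList("This is ", "a string");
        System.out.println(normalizePerChunk(chunks)); // prints "This isa string"
        System.out.println(normalizeWhole(chunks));    // prints "This is a string"
    }
}
```

If the suspicion is right, buffering the complete text node before normalizing 
it, as normalizeWhole does, would avoid the lost space.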

Because the problem depends on where the input happens to be split into 
chunks, it is not deterministic and can show up unpredictably for anyone. 
That is why I rate this problem as critical.

Best Regards,
Christian Elsen
