[ http://issues.apache.org/jira/browse/XERCESC-1361?page=comments#action_60089 ]
     
Michael Glavassevich commented on XERCESC-1361:
-----------------------------------------------

What you're describing is end-of-line handling [1]. This behaviour is expected 
and required by the spec. XML parsers must translate each CR LF pair to a 
single LF (and also any CR not followed by LF to LF) before parsing.

[1] http://www.w3.org/TR/2004/REC-xml-20040204/#sec-line-ends
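
As a rough illustration of the rule in [1], here is a minimal sketch of the 
translation. The function name normalizeLineEnds is hypothetical and is not 
part of Xerces-C++; the real work happens inside XMLReader on decoded 
characters, and this sketch ignores the NEL handling used for XML 1.1.

```cpp
#include <string>

// Hypothetical helper, NOT the Xerces-C++ implementation: applies the
// XML 1.0 end-of-line rule to a string that has already been decoded.
std::string normalizeLineEnds(const std::string& in)
{
    std::string out;
    out.reserve(in.size());
    for (std::string::size_type i = 0; i < in.size(); ++i) {
        if (in[i] == '\r') {
            // A CR LF pair collapses to one LF, so skip the LF.
            if (i + 1 < in.size() && in[i + 1] == '\n')
                ++i;
            // A CR that is not followed by LF also becomes LF.
            out += '\n';
        } else {
            out += in[i];
        }
    }
    return out;
}
```

Called on the content from the report, normalizeLineEnds("aaa\r\nbbb") yields 
"aaa\nbbb", which matches what the characters callback receives. A CR that has 
to reach the application can be written in the document as the character 
reference &#13; since character references are not subject to this 
translation.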

> CRLF is translated to LF in scanCharData
> ----------------------------------------
>
>          Key: XERCESC-1361
>          URL: http://issues.apache.org/jira/browse/XERCESC-1361
>      Project: Xerces-C++
>         Type: Bug
>   Components: SAX/SAX2
>     Versions: 2.6.0
>  Environment: win2k, Xerces-C 2.6 (built from source with VC6 SP5) and 
> Xerces-C 2.1 binary version
>     Reporter: ding hua

>
>  When I parse a simple XML document that has a CRLF between aaa and bbb, the 
> SAX parser calls the characters method with the string translated to 
> aaa LF bbb. The CR character is lost.
> <?xml version="1.0" encoding="gb2312" standalone="no"?>
> <dd><ddrow><text>aaa
> bbb</text>
> </ddrow></dd>
>  Tracing the code, I find that the CR is consumed by handleEOL. I want to 
> keep the content unchanged. Is that reasonable? Thanks.
> The call stack:
> xercesc_2_6::XMLReader::handleEOL(unsigned short & 0x000d, unsigned char 0x00) line 898
> xercesc_2_6::XMLReader::getNextCharIfNot(const unsigned short 0x003c, unsigned short & 0x000d) line 789
> xercesc_2_6::ReaderMgr::getNextCharIfNot(const unsigned short 0x003c, unsigned short & 0x000d) line 398
> xercesc_2_6::IGXMLScanner::scanCharData(xercesc_2_6::XMLBuffer & {...}) line 2630 + 17 bytes
> xercesc_2_6::IGXMLScanner::scanContent() line 837
> xercesc_2_6::IGXMLScanner::scanDocument(const xercesc_2_6::InputSource & {...}) line 204 + 8 bytes
> xercesc_2_6::SAXParser::parse(const xercesc_2_6::InputSource & {...}) line 720
> internal\XMLReader.hpp Ln895
>                 if ( fCharBuf[fCharIndex] == chLF              || 
>                     ((fCharBuf[fCharIndex] == chNEL) && fNEL)  )
>                 {
>                     fCharIndex++;
>                 }


