CRLF is translated to LF in scanCharData
----------------------------------------
Key: XERCESC-1361
URL: http://issues.apache.org/jira/browse/XERCESC-1361
Project: Xerces-C++
Type: Bug
Components: SAX/SAX2
Versions: 2.6.0
Environment: Windows 2000; Xerces-C++ 2.6 (built from source with VC6 + SP5) and the Xerces-C++ 2.1 binary release
Reporter: ding hua
When I parse a simple XML document that has a CRLF between aaa and bbb, the
string passed to the SAX parser's characters() callback is aaa LF bbb. The CR
character is lost.
<?xml version="1.0" encoding="gb2312" standalone="no"?>
<dd><ddrow><text>aaa
bbb</text>
</ddrow></dd>
I traced the code and found that the CR is consumed by handleEOL. I want to
keep the content unchanged. Is that reasonable? Thanks.
The call stack:
xercesc_2_6::XMLReader::handleEOL(unsigned short & 0x000d, unsigned char 0x00) line 898
xercesc_2_6::XMLReader::getNextCharIfNot(const unsigned short 0x003c, unsigned short & 0x000d) line 789
xercesc_2_6::ReaderMgr::getNextCharIfNot(const unsigned short 0x003c, unsigned short & 0x000d) line 398
xercesc_2_6::IGXMLScanner::scanCharData(xercesc_2_6::XMLBuffer & {...}) line 2630 + 17 bytes
xercesc_2_6::IGXMLScanner::scanContent() line 837
xercesc_2_6::IGXMLScanner::scanDocument(const xercesc_2_6::InputSource & {...}) line 204 + 8 bytes
xercesc_2_6::SAXParser::parse(const xercesc_2_6::InputSource & {...}) line 720
internal\XMLReader.hpp, line 895:
    if ( fCharBuf[fCharIndex] == chLF ||
         ((fCharBuf[fCharIndex] == chNEL) && fNEL) )
    {
        fCharIndex++;
    }
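
For reference, the behaviour traced above matches the end-of-line handling in section 2.11 of the XML 1.0 specification: the parser translates CR LF, and any CR not followed by LF, into a single LF before the data reaches the application. Below is a minimal sketch of that rule. It assumes a hypothetical helper named normalizeEol, which is not part of Xerces-C++ and is not the handleEOL implementation; it works on a plain std::string and ignores the NEL case guarded by fNEL in the snippet.

```cpp
#include <string>

// Hypothetical helper, NOT part of Xerces-C++: applies the XML 1.0
// end-of-line rule (section 2.11) to a raw character sequence.
std::string normalizeEol(const std::string& raw)
{
    std::string out;
    out.reserve(raw.size());
    for (std::string::size_type i = 0; i < raw.size(); ++i) {
        if (raw[i] == '\r') {
            // CR LF collapses to a single LF; a lone CR also becomes LF.
            if (i + 1 < raw.size() && raw[i + 1] == '\n')
                ++i;  // skip the LF that follows the CR
            out += '\n';
        } else {
            out += raw[i];
        }
    }
    return out;
}
```

Applied to the text content of the sample document, normalizeEol("aaa\r\nbbb") returns "aaa\nbbb", which is the string seen in characters().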