Hi.

Some time ago I wrote some serialization code that uses the SAX API of
Xerces-C for reading, and a modified SAXPrint example for writing stuff
back out.

I've started blowing the dust off this thing. The last time I made sure it
compiled and ran with Xerces-C was with version 1.3.0, so now with 1.5.0
and G++3.0, I decided to clean stuff up.

The problem is local codepage characters. 
I have tried reading in a file that contains Norwegian-specific
characters (æøå), but I keep getting a segmentation fault in strlen,
deep within the SAX code:

<STACKTRACE>
#0  0x403e8361 in strlen () from /lib/libc.so.6
#1  0x0806fc91 in std::char_traits<char>::length(char const*) (__s=0x0)
    at /usr/include/g++-v3/bits/char_traits.h:158
#2  0x08071906 in std::string::append(char const*) (this=0xbfffeed0,
__s=0x0)
    at /usr/include/g++-v3/bits/basic_string.h:473
#3  0x0808cf21 in std::string::operator+=(char const*) (this=0xbfffeed0,
    __s=0x0) at /usr/include/g++-v3/bits/basic_string.h:457
#4  0x08065286 in txos::SAXDeserializerContext::characters(unsigned
short const*, unsigned) (this=0xbfffee80, chars=0x80efe58, length=9)
    at src/SAXDeserializerContext.cpp:377
#5  0x4011629b in SAXParser::docCharacters(unsigned short const*,
unsigned, bool) () from
/home/trustix/devel/kentda/devel/cpp/xml/xerces/lib/libxerces-c1_5.so
#6  0x40144147 in XMLScanner::sendCharData(XMLBuffer&) ()
   from
/home/trustix/devel/kentda/devel/cpp/xml/xerces/lib/libxerces-c1_5.so
#7  0x40146083 in XMLScanner::scanCharData(XMLBuffer&) ()
   from
/home/trustix/devel/kentda/devel/cpp/xml/xerces/lib/libxerces-c1_5.so
#8  0x4014aa75 in XMLScanner::scanContent(bool) ()
   from
/home/trustix/devel/kentda/devel/cpp/xml/xerces/lib/libxerces-c1_5.so
#9  0x40148b81 in XMLScanner::scanDocument(InputSource const&, bool) ()
   from
/home/trustix/devel/kentda/devel/cpp/xml/xerces/lib/libxerces-c1_5.so
#10 0x40148a59 in XMLScanner::scanDocument(unsigned short const*, bool)
()
   from
/home/trustix/devel/kentda/devel/cpp/xml/xerces/lib/libxerces-c1_5.so
#11 0x40115cad in SAXParser::parse(unsigned short const*, bool) ()  
</STACKTRACE>
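
To make frames #1 to #4 concrete: std::string::operator+= is handed a
null pointer (__s=0x0) from my characters() callback. Below is a minimal,
self-contained sketch of that failure mode with a null guard added. The
helper names (toLocal, appendChars) are hypothetical stand-ins; this is
neither the actual code at SAXDeserializerContext.cpp:377 nor Xerces-C
API. The stand-in converter simply returns a null pointer for anything
outside 7-bit ASCII, which is only an assumption about what happens when
the local code page cannot represent a character.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a local-code-page conversion (NOT Xerces-C API).
// Assumption: it returns a null pointer when a character cannot be
// represented, modelled here as any code unit above 0x7F.
// The caller owns the returned buffer and must delete[] it.
char* toLocal(const unsigned short* chars, std::size_t length) {
    for (std::size_t i = 0; i < length; ++i) {
        if (chars[i] > 0x7F) return NULL;  // unrepresentable character
    }
    char* out = new char[length + 1];
    for (std::size_t i = 0; i < length; ++i) {
        out[i] = static_cast<char>(chars[i]);
    }
    out[length] = '\0';
    return out;
}

// Appends the converted text to `target`. Returns false, leaving `target`
// untouched, instead of passing a null pointer to std::string::operator+=.
bool appendChars(std::string& target,
                 const unsigned short* chars, std::size_t length) {
    char* local = toLocal(chars, length);
    if (local == NULL) return false;
    target += local;
    delete[] local;
    return true;
}
```

With a guard like this the callback could at least report the failed
conversion instead of crashing in strlen; it does not explain why the
conversion fails in the first place.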

I have tried with the file encoded both as ISO8859-1 and as UTF8, running
the text file through iconv for conversion, but neither works:


---------------8<------------------
<?xml version="1.0" encoding="UTF8"?>
<person>
  <name>Ola Børre</name>
</person>

---------------8<------------------
<?xml version="1.0" encoding="ISO8859-1"?>
<person>
  <name>Ola Børre</name>
</person>
---------------8<------------------

Is this a known problem with G++3.0, Xerces-C 1.5.0, the old SAX API or
some FAQ I've missed?

-- 
<[ Kent Dahl ]>================<[ http://www.stud.ntnu.no/~kentda/ ]>
  )____(stud.techn.;ind.øk.data)||(softwareDeveloper.at(Trustix))_(
 /"Opinions expressed are mine and not those of my Employer,      "\
( "the University, my girlfriend, stray cats, banana fruitflies,  " )
 \"nor the frontal lobe of my left cerebral hemisphere.           "/
