FW: Xerces remapping &#xxxx;

Hello all,

I’m using Xerces 2.7 and I’m trying to parse the following snippet from my XML file:

<title>Junk Mail - just how “heavy” a problem is it?</title>

The XML header/encoding on the file is:

<?xml version="1.0" encoding="UTF-8"?>

When I parse this and walk the DOM and extract the contents of this title node, I get back:

Junk Mail - just how â€œheavyâ€ a problem is it?

The special characters are decimal 30, 128, 100 and 30, 128, 99.
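For illustration, here is a minimal sketch (not part of the original setup code) of where three-byte sequences like these can come from. It assumes the two quote characters in the title are U+201C and U+201D, and that the extracted text ends up as UTF-8 held in plain char values. The helper names encodeUtf8, asUnsignedBytes and asSignedBytes are hypothetical; none of them is a Xerces call. Under those assumptions each quote becomes the bytes 0xE2 0x80 0x9C or 0xE2 0x80 0x9D, which read as -30, -128, -100 and -30, -128, -99 through a signed char. That would match the magnitudes quoted above if the minus signs were dropped, which is only a guess. In Windows-1252 the same three bytes display as "â€œ".

```cpp
#include <cassert>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical helper: encode one Unicode code point as UTF-8.
// Surrogate code points are not rejected in this sketch.
std::string encodeUtf8(std::uint32_t cp) {
    assert(cp <= 0x10FFFF);  // outside the Unicode range is a caller error
    std::string out;
    if (cp < 0x80) {
        out += static_cast<char>(cp);
    } else if (cp < 0x800) {
        out += static_cast<char>(0xC0 | (cp >> 6));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else if (cp < 0x10000) {
        out += static_cast<char>(0xE0 | (cp >> 12));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    } else {
        out += static_cast<char>(0xF0 | (cp >> 18));
        out += static_cast<char>(0x80 | ((cp >> 12) & 0x3F));
        out += static_cast<char>(0x80 | ((cp >> 6) & 0x3F));
        out += static_cast<char>(0x80 | (cp & 0x3F));
    }
    return out;
}

// Hypothetical helper: byte values in the range 0..255.
std::vector<int> asUnsignedBytes(const std::string& s) {
    std::vector<int> values;
    for (char c : s) {
        values.push_back(static_cast<int>(static_cast<unsigned char>(c)));
    }
    return values;
}

// Hypothetical helper: the same bytes as a debugger would show them
// for a signed 8-bit char (range -128..127).
std::vector<int> asSignedBytes(const std::string& s) {
    std::vector<int> values;
    for (char c : s) {
        values.push_back(static_cast<int>(static_cast<signed char>(c)));
    }
    return values;
}
```

Calling asUnsignedBytes(encodeUtf8(0x201C)) gives 226, 128, 156, and asSignedBytes on the same string gives -30, -128, -100. If this is what is happening, the parser has expanded the character references as an XML parser is required to do, and the odd-looking output comes from viewing UTF-8 bytes in a single-byte code page.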

Why is Xerces interpreting the &#xxxx; codes and, more importantly, how do I stop it? :)

Here is my Xerces setup code:

m_parser = new XercesDOMParser();
m_parser->setValidationScheme( XercesDOMParser::Val_Never );
m_parser->setDoNamespaces( false );
m_parser->setDoSchema( false );
m_errorHandler = (ErrorHandler*) new HandlerBase();
m_parser->setErrorHandler( m_errorHandler );

Hope someone can help, thanks a lot!!

Graeme Ing
