As far as I can tell, surrogate pairs aren't processed correctly in attributes, PIs, and comments. I filed http://nagoya.apache.org/bugzilla/show_bug.cgi?id=2752 in July.

Does anyone know of any fundamental problems that make this difficult to implement? I would be willing to fix it unless there is more to the problem than I am aware of. It seems like a simple fix to me.

I am most interested in attributes. There are two places where surrogates in attribute values get rejected. Both XMLScanner::basicAttrValueScan and XMLScanner::scanAttValue call XMLReader::isXMLChar, which fails because an individual surrogate is not an XML character. This confuses me, particularly in XMLScanner::basicAttrValueScan, where the call sits inside the surrogate-handling code anyway. I think all that needs to happen is to make this code not trigger an error for surrogate characters.

Comments and PIs need more attention, because it doesn't look like there is any support for surrogates in place there at all.

Chris Hill
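
For illustration, the kind of check described above could look roughly like the sketch below. This is not the Xerces-C++ code: the helper names (isHighSurrogate, isLowSurrogate, isXMLCharUnit, isValidAttrText) are hypothetical, and the sketch assumes the attribute text is already available as UTF-16 code units. The idea is to accept a high surrogate immediately followed by a low surrogate as one character, and to apply the single-unit XML character test only to everything else.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration only; not part of the Xerces-C++ API.
constexpr bool isHighSurrogate(char16_t c) { return c >= 0xD800 && c <= 0xDBFF; }
constexpr bool isLowSurrogate(char16_t c)  { return c >= 0xDC00 && c <= 0xDFFF; }

// True if a single UTF-16 code unit is by itself a legal XML 1.0 Char.
// Surrogate code units are deliberately excluded here.
constexpr bool isXMLCharUnit(char16_t c) {
    return c == 0x9 || c == 0xA || c == 0xD
        || (c >= 0x20 && c <= 0xD7FF)
        || (c >= 0xE000 && c <= 0xFFFD);
}

// Walks the text and accepts well-formed surrogate pairs instead of
// rejecting each half on its own. Requires C++14 (loop in constexpr).
constexpr bool isValidAttrText(const char16_t* s, std::size_t len) {
    for (std::size_t i = 0; i < len; ++i) {
        if (isHighSurrogate(s[i])) {
            // A high surrogate must be followed by a low surrogate.
            if (i + 1 >= len || !isLowSurrogate(s[i + 1])) {
                return false;
            }
            ++i;  // consume the low half of the pair as well
        } else if (!isXMLCharUnit(s[i])) {
            // Lone low surrogate or any other illegal code unit.
            return false;
        }
    }
    return true;
}
```

With this shape, the text 'a', 0xD800, 0xDC00, 'b' is accepted, while a high surrogate followed by an ordinary character, or a low surrogate on its own, is still reported as invalid. A real fix would also have to deal with markup delimiters, entity references, and whitespace normalization, which the sketch leaves out.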
