Hi David,

Section 4.3.3 of the XML 1.0 spec, 2nd Edition [*], states

[[ It is a fatal error if an XML entity is determined (via default, encoding declaration, or higher-level protocol) to be in a certain encoding but contains octet sequences that are not legal in that encoding. ]]

So an option like this would run directly counter to the spec. And would promote non-interoperability with other XML processors, which can be expected to reject such documents.

It's just too bad that Xerces-C was non-conformant in this respect for so long. It's to be regretted that this will cause some folks some pain. But it's pain that would be felt when the documents have to be fed to some other processor anyway, so I guess it's just a matter of now or later...

Cheers,
Neil

[*]: http://www.w3.org/TR/REC-xml#charencoding

Neil Graham
XML Parser Development
IBM Toronto Lab
Phone: 905-413-3519, T/L 969-3519
E-mail: [EMAIL PROTECTED]

From: David Schulze <[EMAIL PROTECTED]>
Date: 08/28/2003 02:30 PM
Please respond to xerces-c-dev
To: "'[EMAIL PROTECTED]'" <[EMAIL PROTECTED]>
cc:
Subject: 12436 - UTF-8 transcoder is not strict (and therefore not secure).

12436 - UTF-8 transcoder is not strict (and therefore not secure).

Xerces 2.3.0 fixed bug #12436, unfortunately my company has shipped some XML files that do not conform to this. (The trade-mark symbol) So newer code that uses version 2.3.0 cannot parse these older files.

Is there a way for me to turn off the validity checking of multi-byte sequences when using UTF-8 that does not require modification to the XML file itself? Perhaps some switch I can set before beginning a parse?
Thanks for any help.

David Schulze
DeLorme Mapping
Yarmouth, Maine, USA

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
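
For anyone who needs to find the affected files before they reach the parser, the "octet sequences that are not legal in that encoding" rule quoted from section 4.3.3 can be approximated with a small stand-alone scan. What follows is only a sketch: the function name `firstIllegalUtf8` is hypothetical, it is not part of Xerces-C and does not reproduce the transcoder's actual code, and it assumes the byte ranges for well-formed UTF-8 given in the Unicode Standard (no overlong forms, no surrogates, nothing above U+10FFFF). It reports the offset of the first offending byte so the document can be corrected at the source.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper (not a Xerces-C API): returns the offset of the first
// byte that begins an ill-formed UTF-8 sequence, or std::string::npos if the
// whole buffer is well-formed UTF-8.
std::size_t firstIllegalUtf8(const std::string& data) {
    const std::size_t n = data.size();
    std::size_t i = 0;
    while (i < n) {
        const unsigned char b0 = static_cast<unsigned char>(data[i]);
        if (b0 <= 0x7F) {            // single-byte (ASCII) character
            ++i;
            continue;
        }
        std::size_t len = 0;         // total length of the sequence
        unsigned char lo = 0x80;     // allowed range for the second byte
        unsigned char hi = 0xBF;
        if (b0 >= 0xC2 && b0 <= 0xDF) {
            len = 2;
        } else if (b0 >= 0xE0 && b0 <= 0xEF) {
            len = 3;
            if (b0 == 0xE0) lo = 0xA0;   // rejects overlong 3-byte forms
            if (b0 == 0xED) hi = 0x9F;   // rejects UTF-16 surrogates
        } else if (b0 >= 0xF0 && b0 <= 0xF4) {
            len = 4;
            if (b0 == 0xF0) lo = 0x90;   // rejects overlong 4-byte forms
            if (b0 == 0xF4) hi = 0x8F;   // rejects values above U+10FFFF
        } else {
            return i;  // stray continuation byte, 0xC0/0xC1, or 0xF5..0xFF
        }
        if (i + len > n) {
            return i;  // sequence is cut off by the end of the buffer
        }
        const unsigned char b1 = static_cast<unsigned char>(data[i + 1]);
        if (b1 < lo || b1 > hi) {
            return i;
        }
        for (std::size_t k = 2; k < len; ++k) {
            const unsigned char b = static_cast<unsigned char>(data[i + k]);
            if (b < 0x80 || b > 0xBF) {
                return i;
            }
        }
        i += len;
    }
    return std::string::npos;
}
```

As an illustration only (the thread does not say which bytes the shipped files actually contain): the trade-mark sign U+2122 is the three bytes E2 84 A2 in UTF-8 and passes this check, whereas a lone byte 0x99, which is how Windows-1252 encodes the same character, is flagged at its offset. Once the offending offsets are known, the files can be re-saved as real UTF-8 or given an encoding declaration that matches their actual bytes, which keeps them acceptable to other conforming processors as well.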