DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT <http://nagoya.apache.org/bugzilla/show_bug.cgi?id=12772>. ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND INSERTED IN THE BUG DATABASE.
http://nagoya.apache.org/bugzilla/show_bug.cgi?id=12772

Xerces J2 is not correctly treating UTF-8 encoded characters in patterns.

           Summary: Xerces J2 is not correctly treating UTF-8 encoded characters in patterns.
           Product: Xerces2-J
           Version: 2.0.1
          Platform: PC
        OS/Version: Linux
            Status: NEW
          Severity: Normal
          Priority: Other
         Component: XML Schema datatypes
        AssignedTo: [EMAIL PROTECTED]
        ReportedBy: [EMAIL PROTECTED]

Xerces J2 and Xerces J1 do not correctly handle UTF-8 encoded characters in patterns.

The errant behaviour was observed with a pattern facet containing the euro character (files attached). The schema pattern is recognised if the euro is written as an entity reference, but a UTF-8 encoded euro character is split into two characters, and the file is validated as though the pattern consisted of those two characters rather than the single UTF-8 encoded euro character.

So, given

1) a pattern in a schema consisting of a euro in UTF-8 encoding, surrounded by square brackets: [e], where e is the UTF-8 euro, and
2) a euro in an instance, coded either as an entity reference (€) or as UTF-8,

the instance is not seen as matching the pattern. If the pattern is [€], written with the entity reference, then the instance is validated correctly.

Result from validating the attached notEuros2.xml against the attached notEuros.xsd:

[Error] file: null notEuros2.xml:3:25: cvc-type.3.1.3: The value '?' of element 'AsUTF8' is not valid.

thanks
Reuben
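For anyone trying to reproduce this without the attachments, the four schema/instance combinations described above could be checked with a small harness along these lines. This is only a sketch under assumptions: the inline schema and instance are hypothetical stand-ins for notEuros.xsd and notEuros2.xml (only the element name AsUTF8 comes from the error message), the class and method names are made up, and it uses the JAXP javax.xml.validation API rather than whatever driver produced the error above. Which implementation actually runs depends on what JAXP finds on the classpath, so results may differ from the Xerces 2.0.1 behaviour reported here.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.XMLConstants;
import javax.xml.transform.stream.StreamSource;
import javax.xml.validation.Schema;
import javax.xml.validation.SchemaFactory;
import javax.xml.validation.Validator;
import org.xml.sax.SAXException;

// Hypothetical harness; not part of Xerces or of the attached files.
final class EuroPatternCheck {

    // The euro sign written literally and as a numeric character reference.
    static final String EURO_LITERAL = "\u20AC";
    static final String EURO_REFERENCE = "&#x20AC;";

    // Hypothetical stand-in for notEuros.xsd: one element whose value
    // must match the pattern [<euro>].
    static String schema(String euro) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<xs:schema xmlns:xs=\"http://www.w3.org/2001/XMLSchema\">"
            + "<xs:element name=\"AsUTF8\"><xs:simpleType>"
            + "<xs:restriction base=\"xs:string\">"
            + "<xs:pattern value=\"[" + euro + "]\"/>"
            + "</xs:restriction></xs:simpleType></xs:element>"
            + "</xs:schema>";
    }

    // Hypothetical stand-in for notEuros2.xml.
    static String instance(String content) {
        return "<?xml version=\"1.0\" encoding=\"UTF-8\"?>"
            + "<AsUTF8>" + content + "</AsUTF8>";
    }

    // Wraps the document as UTF-8 bytes so the parser does the decoding
    // itself, as it would when reading a file from disk.
    static StreamSource utf8Source(String xml) {
        return new StreamSource(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
    }

    // Returns true if the instance validates against the schema.
    static boolean isValid(String schemaXml, String instanceXml) throws Exception {
        SchemaFactory factory =
            SchemaFactory.newInstance(XMLConstants.W3C_XML_SCHEMA_NS_URI);
        Schema schema = factory.newSchema(utf8Source(schemaXml));
        Validator validator = schema.newValidator();
        try {
            validator.validate(utf8Source(instanceXml));
            return true;
        } catch (SAXException e) {
            // With no ErrorHandler set, a validation error is thrown.
            return false;
        }
    }

    public static void main(String[] args) throws Exception {
        String[] forms = { EURO_LITERAL, EURO_REFERENCE };
        for (String patternForm : forms) {
            for (String instanceForm : forms) {
                boolean ok = isValid(schema(patternForm), instance(instanceForm));
                System.out.println(
                    "pattern " + (patternForm.equals(EURO_LITERAL) ? "UTF-8" : "reference")
                    + ", instance " + (instanceForm.equals(EURO_LITERAL) ? "UTF-8" : "reference")
                    + ": " + (ok ? "valid" : "INVALID"));
            }
        }
    }
}
```

On a parser that decodes UTF-8 correctly, all four combinations should report valid, since the character reference and the raw UTF-8 bytes denote the same character once parsed. The two rows with a UTF-8 pattern correspond to the failing case described in this report.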
