I think that the parser is correct. The example given is of two NMTOKEN values separated by char refs that resolve to new lines. Normalization happens after character ref expansion, and the normalization rules indicate that all whitespace should be reduced to a single space. And besides, there are LOTS of attribute normalization tests in the various test suites and the parser doesn't have any failures on that stuff, so I'm relatively confident that (unless something has changed for the worse recently) it's working correctly.

So this:

<normNames attr="A


B"/>

Becomes:

<normNames attr="A\r\r\rB"/>

after the char refs are expanded. And then the whitespace is folded down to single spaces, which leaves:

<normNames attr="A B"/>

--------------
Dean Roddey
Software Geek Extraordinaire
Portal, Inc
[EMAIL PROTECTED]

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Thursday, April 05, 2001 10:58 AM
To: [EMAIL PROTECTED]
Subject: [Bug 1236] New - incorrect NMTOKENS attribute normalization

http://nagoya.apache.org/bugzilla/show_bug.cgi?id=1236

*** shadow/1236 Thu Apr 5 10:57:56 2001
--- shadow/1236.tmp.11194 Thu Apr 5 10:57:56 2001
***************
*** 0 ****
--- 1,57 ----
+ +============================================================================+
+ | incorrect NMTOKENS attribute normalization                                 |
+ +----------------------------------------------------------------------------+
+ |        Bug #: 1236                       Product: Xerces-C                 |
+ |       Status: NEW                        Version: 1.4                      |
+ |   Resolution:                           Platform: PC                       |
+ |     Severity: Critical                OS/Version:                          |
+ |     Priority:                          Component: Non-Validating Parser    |
+ +----------------------------------------------------------------------------+
+ |  Assigned To: [EMAIL PROTECTED]                                            |
+ |  Reported By: [EMAIL PROTECTED]                                            |
+ +----------------------------------------------------------------------------+
+ |          URL:                                                              |
+ +============================================================================+
+ |                                DESCRIPTION                                 |
+ Xerces 1.4 generates incorrect output for
+ Normalization of Attribute that are NMTOKENS.
+ The attribute value stripped out too much character
+ reference. (re: XML Specification 1.0 section 3.3.3
+ Attribute-value normalization)
+
+ I compiled DOMPrint example with
+ Xerces 1.4 using MSDev 6.0 on Windows NT:
+
+ using the following test case attr.xml:
+ <!DOCTYPE normNames [
+ <!ELEMENT normNames EMPTY>
+ <!ATTLIST normNames attr NMTOKENS #IMPLIED>
+ ]>
+ <normNames attr="A


B"/>
+
+ I got the following output:
+ <!DOCTYPE normNames [
+ <!ELEMENT normNames EMPTY>
+ <!ATTLIST normNames attr NMTOKENS #IMPLIED>
+ ]>
+ <normNames attr="A B"/>
+
+ But the expected output according to the XML Specification
+ is
+ <!DOCTYPE normNames [
+ <!ELEMENT normNames EMPTY>
+ <!ATTLIST normNames attr NMTOKENS #IMPLIED>
+ ]>
+ <normNames attr="A #A #A #A B"/>
+
+ In fact, Xerces 1.4 does not seem to generate the
+ correct output for the last two examples in section
+ 3.3.3 of XML Specification 1.0. The last two
+ examples are:
+
+ * a="&d;&d;A&a;&a;B&da;"
+ * a="&#xd;&#xd;A&#xa;&#xa;B&#xd;&#xa;"
+
+ thanks!
+
+ --Michele

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]

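The disagreement in this thread comes down to the order in which whitespace folding and character reference expansion are applied. The sketch below is only an illustration of the two readings, not Xerces code. The first function follows the reply at the top of the thread (expand the char refs, then fold every run of whitespace to one space). The second follows the bug report's reading of section 3.3.3 (literal whitespace becomes a space, a character produced by a character reference is appended unchanged, and only runs of #x20 are collapsed for tokenized types such as NMTOKENS). The function names are hypothetical, general entity references such as &d; are not handled, and `&#xA;` in the sample is an assumption, because the archived message shows the references only as line breaks.

```python
import re

# Matches hexadecimal (&#xA;) and decimal (&#10;) character references.
CHAR_REF = re.compile(r"&#(?:x([0-9A-Fa-f]+)|([0-9]+));")

# The four XML whitespace characters: space, tab, CR, LF.
WS_CLASS = r"[ \t\r\n]"


def _ref_to_char(match):
    """Turn one matched character reference into the character it names."""
    hex_digits, dec_digits = match.groups()
    return chr(int(hex_digits, 16)) if hex_digits else chr(int(dec_digits))


def normalize_expand_then_fold(raw):
    """Reading from the reply: expand char refs, then fold ALL whitespace."""
    expanded = CHAR_REF.sub(_ref_to_char, raw)
    return re.sub(WS_CLASS + "+", " ", expanded).strip(" ")


def normalize_refs_kept(raw):
    """Reading from the bug report: chars from char refs are kept as-is."""
    parts = []
    pos = 0
    for match in CHAR_REF.finditer(raw):
        # Literal text between references: each whitespace char becomes #x20.
        parts.append(re.sub(WS_CLASS, " ", raw[pos:match.start()]))
        # The referenced character is appended without further treatment.
        parts.append(_ref_to_char(match))
        pos = match.end()
    parts.append(re.sub(WS_CLASS, " ", raw[pos:]))
    value = "".join(parts)
    # Tokenized (non-CDATA) types: trim and collapse #x20 only.
    return re.sub(r" +", " ", value).strip(" ")


if __name__ == "__main__":
    sample = "A&#xA;&#xA;&#xA;B"  # assumed form of the test case attribute
    print(repr(normalize_expand_then_fold(sample)))
    print(repr(normalize_refs_kept(sample)))
```

Running both functions on the same attribute value makes the difference between the output the reporter got and the output the reporter expected easy to see.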