Hi everyone, I would be curious to know which URI tests are failing.

Here's a summary of the changes that I've made:

1. '[' and ']', added in RFC 2732, are not allowed in path segments.
2. No URI can begin with a ':'.
3. The scheme-specific part of a URI cannot be empty, so any URIs of the
   form scheme: or scheme:#fragment are not valid according to the BNF in
   RFC 2396.
4. Fixed relative URI resolution in the case where the base URI has a null
   path. (This shouldn't show up in schema validation.)
5. Whitespace (even escaped as %20) is not permitted in the authority
   portion of a URI.
6. IPv4 addresses must match 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT.
   This has been the rule since RFC 2732.
7. IPv4 addresses are 32-bit, therefore no segment may be larger than 255.
   This isn't expressed by the grammar.
8. Hostnames cannot end with a '-'.
9. Labels in a hostname must be 63 bytes or less [RFC 1034].
10. Hostnames may be no longer than 255 bytes [RFC 1034]. (That restriction
    was already there. I just moved it inwards.)
11. Added support for the IPv6 references introduced in RFC 2732. URIs such
    as http://[::ffff:1.2.3.4] are valid. The BNF in RFC 2373 isn't correct,
    so IPv6 addresses are read according to section 2.2 of RFC 2373.

Changes 6-10 tightened the checking of the host portion of the authority.
Adding support for registry-based authority [RFC 2396 - section 3.2.1] will
permit the URIs that would be rejected by changes 6-10.

The BNF in RFC 2396 is ambiguous with respect to the path and authority
components: a path component can itself start with '//', which normally
introduces the authority. The ambiguity is resolved in section 4.3 of
RFC 2396. Currently, when the URI implementation sees a URI beginning with
scheme://, it only tries to match an authority; it does not fall back to
matching the path portion when what follows cannot be an authority. Fixing
this would permit URIs that would be rejected by change #5. For example,
scheme://%20whitespace%20 is valid, where //%20whitespace%20 is the path
portion.

Perhaps the problem is with change #3.
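
To make a few of these checks concrete, here is a rough sketch in Java of changes 3 and 6-10. It is not the code from the URI implementation; the class name UriCheckSketch and its method names are made up for illustration, and the first method assumes an absolute URI, so that the first ':' ends the scheme. The hostname check also enforces that labels consist of alphanumerics and '-', which comes from the RFC 2396 grammar rather than from the list above, and it leaves out the rule that the top label must start with a letter.

```java
// Illustrative sketch only: class and method names are hypothetical,
// and this is not the actual URI implementation discussed in this thread.
final class UriCheckSketch {
    private UriCheckSketch() {}

    // Change 3: true for "scheme:" and "scheme:#fragment".
    // Assumes the input is absolute, so the first ':' ends the scheme.
    static boolean hasEmptySchemeSpecificPart(String uri) {
        int colon = uri.indexOf(':');
        if (colon < 0) {
            return false; // no scheme at all; relative references are out of scope
        }
        int hash = uri.indexOf('#', colon + 1);
        int end = (hash < 0) ? uri.length() : hash;
        return end == colon + 1;
    }

    // Changes 6 and 7: exactly four groups of 1-3 digits, none above 255.
    static boolean isWellFormedIPv4(String host) {
        String[] parts = host.split("\\.", -1); // -1 keeps trailing empty groups
        if (parts.length != 4) {
            return false;
        }
        for (String part : parts) {
            if (part.isEmpty() || part.length() > 3) {
                return false;
            }
            for (int i = 0; i < part.length(); i++) {
                char c = part.charAt(i);
                if (c < '0' || c > '9') {
                    return false;
                }
            }
            if (Integer.parseInt(part) > 255) {
                return false; // the grammar alone would accept 256..999
            }
        }
        return true;
    }

    // Changes 8-10: no label ends with '-', labels are at most 63 bytes,
    // and the whole hostname is at most 255 bytes (ASCII only, so chars == bytes).
    static boolean isWellFormedHostname(String host) {
        if (host.isEmpty() || host.length() > 255) {
            return false;
        }
        // RFC 2396 allows one trailing '.', so ignore it when splitting into labels.
        String trimmed = host.endsWith(".") ? host.substring(0, host.length() - 1) : host;
        for (String label : trimmed.split("\\.", -1)) {
            if (label.isEmpty() || label.length() > 63) {
                return false;
            }
            if (label.charAt(0) == '-' || label.charAt(label.length() - 1) == '-') {
                return false;
            }
            for (int i = 0; i < label.length(); i++) {
                char c = label.charAt(i);
                boolean alnum = (c >= 'a' && c <= 'z') || (c >= 'A' && c <= 'Z')
                        || (c >= '0' && c <= '9');
                if (!alnum && c != '-') {
                    return false;
                }
            }
        }
        return true;
    }
}
```

Under these assumptions, UriCheckSketch.hasEmptySchemeSpecificPart("DAV:") returns true, so a validator applying change #3 strictly would reject that URI, while isWellFormedIPv4("1.2.3.256") returns false because of the 255 limit from change 7.
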
Apparently 'DAV:' is a valid URI, though the grammar doesn't permit it.
See the discussion at:
http://www.apache.org/~fielding/uri/rev-2002/issues.html#014-empty-opaque_part

On Mon, 21 Jul 2003, Neil Graham wrote:

> Hi all,
>
> Recently, Michael's been working hard to make our formerly rather woeful
> URI implementation conform more closely to the relevant RFC's. I just
> noticed some JAXP TCK tests that try and test the Schema anyURI type that
> have started to fail as a result. Now I have no tremendous expertise in
> the area of URI validation, but from what I've gleaned so far, what
> Michael's done looks quite correct.
>
> I know there's lots of Sun folks on the list; I wonder if anyone would be
> willing to run the TCK and bring to our attention any areas in which the
> new code doesn't appear to conform to the Schema specs? With a release
> scheduled for the end of next week, it seems pretty important to straighten
> this out as soon as possible; it would certainly be unfortunate if any
> correct changes had to be pulled back because of a TCK-compliance issue.
>
> Cheers!
> Neil
> Neil Graham
> XML Parser Development
> IBM Toronto Lab
> Phone: 905-413-3519, T/L 969-3519
> E-mail: [EMAIL PROTECTED]

--------------------
Michael Glavassevich
[EMAIL PROTECTED]

---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]