Hi everyone,

I would be curious to know which URI tests are failing.

Here's a summary of changes that I've made:

1. '[' and ']', added in RFC 2732, are not allowed in path segments.
2. No URI can begin with a ':'.
3. The scheme specific part of a URI cannot be empty, so any URIs of the
form scheme: or scheme:#fragment are not valid according to the BNF in RFC
2396.
4. Fixed relative URI resolution in the case where the base URI has a null
path. (This shouldn't show up in schema validation.)
5. Whitespace (even escaped as %20) is not permitted in the authority
portion of a URI.
6. IPv4 addresses must match 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT "."
1*3DIGIT, the form required since RFC 2732.
7. IPv4 addresses are 32-bit, therefore no segment may be larger than 255.
This isn't expressed by the grammar.
8. Hostnames cannot end with a '-'.
9. Labels in a hostname must be 63 bytes or less [RFC 1034].
10. Hostnames may be no longer than 255 bytes [RFC 1034]. (That
restriction was already there. I just moved it inwards.)
11. Added support for the IPv6 references introduced in RFC 2732. URIs
such as http://[::ffff:1.2.3.4] are valid. The BNF in RFC 2373 isn't
correct, so IPv6 addresses are read according to section 2.2 of RFC 2373.
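
To make changes 6 and 7 concrete, here is a minimal sketch of the IPv4
check as described above. It is not the actual implementation; the class
and method names (Ipv4Sketch, isWellFormedIPv4) are made up for
illustration.

```java
// Illustrative sketch only; names are hypothetical, not the real URI class.
final class Ipv4Sketch {

    /**
     * Returns true if s matches
     *   1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT "." 1*3DIGIT
     * and no segment is larger than 255.
     */
    static boolean isWellFormedIPv4(String s) {
        final int n = s.length();
        int segments = 0;
        int i = 0;
        while (i < n) {
            final int start = i;
            int value = 0;
            // Consume between one and three decimal digits.
            while (i < n && s.charAt(i) >= '0' && s.charAt(i) <= '9') {
                value = value * 10 + (s.charAt(i) - '0');
                i++;
                if (i - start > 3) {
                    return false; // more than three digits in a segment
                }
            }
            if (i == start) {
                return false; // empty segment or non-digit character
            }
            if (value > 255) {
                return false; // not expressed by the grammar (change 7)
            }
            segments++;
            if (i == n) {
                break;
            }
            if (s.charAt(i) != '.') {
                return false;
            }
            i++;
            if (i == n) {
                return false; // trailing '.'
            }
        }
        return segments == 4;
    }
}
```

Note that leading zeros such as 001.2.3.4 pass this check, because the
grammar only limits each segment to three digits.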

Changes 6-10 tightened the checking of the host portion of the authority.
Adding support for registry-based authority [RFC 2396 - section 3.2.1]
will permit the URIs that would be rejected by changes 6-10.
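
For the hostname side of changes 8-10, a rough sketch of just those three
checks might look like the following. It deliberately ignores the rest of
the hostname grammar, assumes the host is plain ASCII (so characters and
bytes coincide), and uses hypothetical names (HostnameSketch,
passesLengthAndHyphenChecks) that do not come from the real code.

```java
// Illustrative sketch only; covers changes 8-10, not the full grammar.
final class HostnameSketch {

    static final int MAX_HOSTNAME_LENGTH = 255; // RFC 1034
    static final int MAX_LABEL_LENGTH = 63;     // RFC 1034

    static boolean passesLengthAndHyphenChecks(String host) {
        if (host.isEmpty() || host.length() > MAX_HOSTNAME_LENGTH) {
            return false;
        }
        // RFC 2396: hostname = *( domainlabel "." ) toplabel [ "." ]
        // so a single trailing '.' is set aside before the other checks.
        String h = host.endsWith(".")
                ? host.substring(0, host.length() - 1)
                : host;
        if (h.isEmpty() || h.endsWith("-")) {
            return false; // hostnames cannot end with '-'
        }
        int labelLength = 0;
        for (int i = 0; i < h.length(); i++) {
            if (h.charAt(i) == '.') {
                labelLength = 0; // start of a new label
            } else if (++labelLength > MAX_LABEL_LENGTH) {
                return false; // label longer than 63 bytes
            }
        }
        return true;
    }
}
```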

The BNF in RFC 2396 is ambiguous with respect to the path and authority
components: a path component can start with '//', which usually
introduces the authority. The ambiguity is resolved in section 4.3 of
RFC 2396. Currently the URI implementation only tries to match an
authority when it sees a URI beginning with scheme://; it does not fall
back to matching the path portion when what follows cannot be an
authority. Fixing this would permit URIs that would be rejected by
change #5. For example, scheme://%20whitespace%20 is valid, where
//%20whitespace%20 is the path portion.

Perhaps the problem is with change #3. Apparently 'DAV:' is a valid
URI, though the grammar doesn't permit it. See discussion at:
http://www.apache.org/~fielding/uri/rev-2002/issues.html#014-empty-opaque_part.
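
If change #3 does turn out to be the culprit, the URIs affected are those
of the form scheme: or scheme:#fragment. A small sketch for spotting them
in a test list is below; the names (SchemeSpecificPartSketch,
hasEmptySchemeSpecificPart) are hypothetical, and it only looks at where
the ':' and '#' fall rather than validating the scheme itself.

```java
// Illustrative sketch only; it does not validate the scheme characters.
final class SchemeSpecificPartSketch {

    /**
     * Returns true for strings of the form "scheme:" or "scheme:#fragment",
     * i.e. where nothing sits between the scheme's ':' and the fragment
     * (or the end of the string).
     */
    static boolean hasEmptySchemeSpecificPart(String uri) {
        int colon = uri.indexOf(':');
        if (colon <= 0) {
            return false; // no scheme; a leading ':' is covered by change 2
        }
        // A '/', '?' or '#' before the ':' means the ':' does not end a
        // scheme, so this is a relative reference.
        for (int i = 0; i < colon; i++) {
            char c = uri.charAt(i);
            if (c == '/' || c == '?' || c == '#') {
                return false;
            }
        }
        int end = uri.indexOf('#', colon + 1);
        if (end < 0) {
            end = uri.length();
        }
        return end == colon + 1;
    }
}
```

With this, "DAV:" and "DAV:#frag" are flagged, while "http://example.org"
and "a/b:c" are not.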

On Mon, 21 Jul 2003, Neil Graham wrote:

> Hi all,
>
> Recently, Michael's been working hard to make our formerly rather woeful
> URI implementation conform more closely to the relevant RFC's.  I just
> noticed some JAXP TCK tests that try and test the Schema anyURI type that
> have started to fail as a result.  Now I have no tremendous expertise in
> the area of URI validation, but from what I've gleaned so far, what
> Michael's done looks quite correct.
>
> I know there's lots of Sun folks on the list; I wonder if anyone would be
> willing to run the TCK and bring to our attention any areas in which the
> new code doesn't appear to conform to the Schema specs?  With a release
> scheduled for the end of next week, it seems pretty important to straighten
> this out as soon as possible; it would certainly be unfortunate if any
> correct changes had to be pulled back because of a TCK-compliance issue.
>
> Cheers!
> Neil
> Neil Graham
> XML Parser Development
> IBM Toronto Lab
> Phone:  905-413-3519, T/L 969-3519
> E-mail:  [EMAIL PROTECTED]
>
>
>
> ---------------------------------------------------------------------
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>
>

--------------------
Michael Glavassevich
[EMAIL PROTECTED]

