Title: RE: filenames versus URI's

The one place where I know the difference matters is in
  XMLReader.setProperty(
    "http://apache.org/xml/properties/schema/external-schemaLocation",
    schemaLocations)

The schemaLocations String is a space-separated list of
(namespace URI, schema location URI) pairs. The syntax is the same as for
schemaLocation attributes in instance documents.

The problem here is that URIs absolutely cannot contain spaces in
this list (since a space is a delimiter).
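
To make the failure concrete, here is a rough sketch of what any consumer
of such a list has to do. It is not the parser's actual code; the class
name SchemaLocationList, the parse method and the example.com URIs are all
made up for illustration. It only shows that splitting on whitespace leaves
no way to tell a space inside a URI from a space between items.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of any parser API.
class SchemaLocationList {

    /** Splits a whitespace-separated list into two-item pairs. */
    static List<String[]> parse(String schemaLocations) {
        List<String[]> pairs = new ArrayList<>();
        String trimmed = schemaLocations.trim();
        if (trimmed.isEmpty()) {
            return pairs;
        }
        String[] tokens = trimmed.split("\\s+");
        // An odd token count means some item is missing its partner,
        // or one of the URIs contained an unescaped space.
        if (tokens.length % 2 != 0) {
            throw new IllegalArgumentException(
                "expected an even number of items, got " + tokens.length);
        }
        for (int i = 0; i < tokens.length; i += 2) {
            pairs.add(new String[] { tokens[i], tokens[i + 1] });
        }
        return pairs;
    }

    public static void main(String[] args) {
        // Escaped space: one well-formed pair.
        String ok = "http://example.com/ns file:///c:/schemas/my%20schema.xsd";
        System.out.println(parse(ok).size());

        // Raw space: the second URI is cut in two, so the list has three items.
        String bad = "http://example.com/ns file:///c:/schemas/my schema.xsd";
        try {
            parse(bad);
        } catch (IllegalArgumentException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```

Note that a list with two unescaped spaces would split into an even number
of items and be paired up wrongly without any error at all, which is arguably
worse than the rejected case.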

That means that tolerating spaces in URIs can only be done
sometimes.  I think I'd vote for consistency, and aim for
strict adherence to the URI rules.  A utility method that
can convert filenames to URIs would be a nice addition, though.
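
As a starting point for such a utility, something along these lines might
do. This is only a sketch built on the standard java.io.File.toURI() and
java.net.URI.toASCIIString() calls; the class and method names are
hypothetical, and the result depends on the platform and the current
working directory, since File interprets the name by local rules.

```java
import java.io.File;
import java.net.URI;

// Hypothetical utility, not an existing parser class.
class FileNameToUri {

    /**
     * Converts a platform-dependent filename to an absolute file: URI string.
     * File.toURI() escapes characters that are illegal in URIs (such as
     * spaces); toASCIIString() additionally percent-encodes non-ASCII
     * characters as UTF-8 octets.
     */
    static String toUriString(String fileName) {
        URI uri = new File(fileName).toURI();
        return uri.toASCIIString();
    }

    public static void main(String[] args) {
        // Relative names are resolved against the current directory,
        // so the prefix of the output varies from machine to machine.
        System.out.println(toUriString("my file.xml"));
        System.out.println(toUriString("caf\u00e9.xml"));
    }
}
```

A string produced this way contains no raw spaces, so it can be placed in
the schemaLocations list safely.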

-----Original Message-----
From: [EMAIL PROTECTED] [mailto:[EMAIL PROTECTED]]
Sent: Tuesday, September 10, 2002 9:11 AM
To: [EMAIL PROTECTED]; [EMAIL PROTECTED]
Subject: filenames versus URI's


Hi all,

There are a number of places where the parser has to interact with the file
system (e.g., in resolving systemId's, schemaLocation hints and Strings
supplied to our JAXP #parse methods.)  To my knowledge, all of these
situations are expecting a URI--possibly relative--rather than a filename.

Historically--at least in recent history--we've been more and more
permissive in what we'll accept here.  We can usually figure out, for
instance, that "c:\myfile.xml" maps to file:///c:/myfile.xml.  But recently,
there has been a deluge of reports that we can't handle filenames with
spaces or other characters disallowed by the URI spec, or that non-ASCII
characters can't be processed.

It would be possible--in principle--to keep on becoming more accommodating.
It would make our code more complex, and for things like Chinese characters
it isn't clear that that complexity wouldn't be rather substantial.  Or, we
could change course and decide to allow only true URI's to be used
consistently, and restrict ourselves to making sure we can absolutize
relative URI's correctly in whatever context they're given.
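
For what it's worth, the strict option is easy to sketch with the standard
java.net.URI class. This is only an illustration, not our actual code path;
the class name, the method name and the example.com base URI are invented.

```java
import java.net.URI;

// Hypothetical illustration of strict URI handling.
class AbsolutizeSketch {

    /** Resolves a (possibly relative) URI reference against a base URI. */
    static String absolutize(String base, String reference) {
        // URI.create throws IllegalArgumentException when the reference
        // is not a legal URI, e.g. when it contains a raw space.
        return URI.create(base).resolve(URI.create(reference)).toString();
    }

    public static void main(String[] args) {
        String base = "http://example.com/docs/instance.xml";
        System.out.println(absolutize(base, "schemas/my%20schema.xsd"));
        System.out.println(absolutize(base, "../common/types.xsd"));
        try {
            absolutize(base, "schemas/my schema.xsd");
        } catch (IllegalArgumentException e) {
            System.out.println("not a URI: " + e.getMessage());
        }
    }
}
```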

What do people think?  Is it too much to ask of applications to provide
URI's rather than platform-dependent filenames?  Do people think increasing
the complexity of our stream-processing code is worth whatever convenience
is gained?  Is it acceptable that, by allowing filenames, we're violating
the letter of many specifications and probably not aiding the cause of
platform/parser independence, since we're being more permissive than other
products are likely to be?

All thoughts appreciated!

Cheers,
Neil
Neil Graham
XML Parser Development
IBM Toronto Lab
Phone:  905-413-3519, T/L 969-3519
E-mail:  [EMAIL PROTECTED]


