Re: Java encoding names

neilg Mon, 27 May 2002 07:40:39 -0700

Hi Neeraj,

>Ideally, we should first ask the user's permission before we make any
>attempt to parse such (other than UTF-8, UTF-16) documents, to make sure
>that the application/user is aware that by doing so it is tying itself to
>a processor-based feature and may not obtain the desired result when
>shifting to another processor.


It seems to me that this is the right direction to move in, rather than the
one you're actually proposing.  :-)  The only trouble with it is that it
would represent a significant break with backward compatibility--even
generating a warning makes a new version of the parser instantly
backward-incompatible with old versions for any application that detects
warnings.  So unfortunately I don't think we can go down this road either.

>a) there are encodings in the real world other than those, and there are
>real applications which rely on those encodings because their requirements
>go far beyond UTF-8 and UTF-16. And at that point those documents are not
>interoperable any more. The application is relying on the individual
>parser's capability to process its "X" encoding and is very well aware
>that the same may not be the case when shifting to another parser.

My own experience with developers and encodings suggests to me that few
people are very well aware of anything.  :-) To my mind, the really
regrettable problem is that Java chose, at some point in the misty past, to
part company with W3C, IETF and IANA and do its own thing with regard to
encoding names...

>why does Xerces2 limit itself to IANA? We
>are not completely addressing the problem or solving anything by doing so.

But we'd be completely throwing our hands up in despair at the problem if
we allowed Java encoding names by default as well as IANA names.  To my
mind, the current situation is as good a compromise as we're likely to
find.

I am curious though as to why setting the
http://apache.org/xml/features/allow-java-encodings feature is such a
tremendous burden?  By the logic in your note, surely no application using
a parser instantiated through a SAX factory or JAXP should rely on that
parser to necessarily support Java encodings...  Surely it will have to
know something about the parser that's actually being employed.
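
For reference, opting in through JAXP comes down to one setFeature call on
the factory. The following is only a minimal sketch, not a recommended
pattern: the class name JavaEncodingFeature and the helpers
tryAllowJavaEncodings and parses are hypothetical names made up for
illustration, and the Cp1252 test document is an assumed example of a
Java-style encoding name. Whether the feature is recognized, and whether
that document parses, depends on which parser the factory actually returns.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXException;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper class, for illustration only.
final class JavaEncodingFeature {
    // Feature URI quoted in the message above.
    static final String ALLOW_JAVA_ENCODINGS =
            "http://apache.org/xml/features/allow-java-encodings";

    // Tries to switch the feature on. Returns false if the underlying
    // parser does not recognize or support it, so the caller finds out
    // that it is not talking to a parser with this Xerces-specific knob.
    static boolean tryAllowJavaEncodings(SAXParserFactory factory) {
        try {
            factory.setFeature(ALLOW_JAVA_ENCODINGS, true);
            return true;
        } catch (ParserConfigurationException
                | SAXNotRecognizedException
                | SAXNotSupportedException e) {
            return false;
        }
    }

    // Returns true if the document parses without a fatal error.
    // The input is assumed to be ASCII-only so the byte encoding is unambiguous.
    static boolean parses(SAXParserFactory factory, String xml) {
        try {
            byte[] bytes = xml.getBytes(StandardCharsets.US_ASCII);
            factory.newSAXParser()
                   .parse(new ByteArrayInputStream(bytes), new DefaultHandler());
            return true;
        } catch (ParserConfigurationException | SAXException | IOException e) {
            return false;
        }
    }

    public static void main(String[] args) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        boolean accepted = tryAllowJavaEncodings(factory);
        System.out.println("allow-java-encodings accepted: " + accepted);

        // "Cp1252" is a Java-style encoding name, used here as an assumed example.
        String xml = "<?xml version=\"1.0\" encoding=\"Cp1252\"?><doc/>";
        System.out.println("Cp1252 document parsed: " + parses(factory, xml));
    }
}
```

The boolean return value is the point of the sketch: an application that
depends on Java encoding names learns at configuration time whether the
parser it was handed can honour them, instead of failing later on a
particular document.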

Cheers,
Neil
Neil Graham
XML Parser Development
IBM Toronto Lab
Phone:  905-413-3519, T/L 969-3519
E-mail:  [EMAIL PROTECTED]


