Re: Java encoding names

Neeraj Bajaj Thu, 30 May 2002 04:22:47 -0700

> Andy Clark wrote:
> > I am curious though as to why setting the
> > http://apache.org/xml/features/allow-java-encodings feature is such a
> > tremendous burden?  
> 
> I also fail to see the reason why we should change the
> default behavior to accept Java encoding names. Xerces2 
> does not prevent anyone from using Java encoding names --



OK, I have tried to put the pieces together in one place as to why I think 
Xerces2 should accept encoding names other than IANA names (Java names, etc.) 
by default. My position combines several factors: convenience, limitations, 
application portability, the XML spec, and wide usage.


1. Although Xerces provides a feature to turn on support
for Java encodings, using this feature makes applications
dependent on Xerces: attempting to set this feature on
other parsers would result in exceptions.

2. As an open source project, Xerces2 should be usable
_out of the box_ by the widest possible set of users.
By allowing Java encodings by default, Xerces2 would
be usable by applications that need to process XML documents
having Java encoding names. Quite a number of existing
applications use Java encoding names.

3. There is no portable way for an application to provide
a byte-to-char converter to the parser based on the encoding
name in the XML declaration in a document. Ideally there
would be a callback from the parser to the application that
allows the application to provide a converter for the encoding
that's in the XML declaration. Since there is no way to
do that today portably, the next best alternative is
for the parser to support all the encodings available
in its environment. Since Xerces2 is Java based, it should
support Java encoding names.

4. The XML spec encourages parsers to be able to process documents 
in encodings other than UTF-8 and UTF-16, and describes this as the 
desired behavior.

5. Xerces2 already supports IANA encoding names other than
UTF-8 and UTF-16, so it is already supporting features that
are optional in the XML 1.0 spec and encouraged by it.

6. Most other parsers, such as Crimson, support Java encoding names 
by default, so applications using Crimson that want to move to 
Xerces2 will run into this problem. We should try to accelerate
adoption of Xerces2 by not breaking applications that
today run with Crimson. XMLSpy also supports some of the 
Java encoding names by default.

7. Supporting Java encoding names would not break applications 
that use IANA names and would be fully backward compatible. 
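
To make point 1 concrete, here is a minimal sketch of what an application
has to do today if it wants Java encoding names but still wants to run on
more than one parser. The class and method names are hypothetical and not
taken from any existing application; only the feature URI and the JAXP/SAX
calls are real. The point is that the application must guard a
Xerces-specific feature ID with exception handling.

```java
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.SAXNotRecognizedException;
import org.xml.sax.SAXNotSupportedException;

// Hypothetical helper class, for illustration only.
final class JavaEncodingFeature {

    // Xerces-specific feature ID discussed in this thread.
    static final String ALLOW_JAVA_ENCODINGS =
        "http://apache.org/xml/features/allow-java-encodings";

    // Tries to enable Java encoding names on the given factory.
    // Returns true if the feature was accepted, false if the
    // underlying parser does not recognize or support it.
    static boolean tryAllowJavaEncodings(SAXParserFactory factory) {
        try {
            factory.setFeature(ALLOW_JAVA_ENCODINGS, true);
            return true;
        } catch (SAXNotRecognizedException
                 | SAXNotSupportedException
                 | ParserConfigurationException e) {
            // A parser that does not know this feature ID lands here.
            return false;
        }
    }

    public static void main(String[] args) {
        SAXParserFactory factory = SAXParserFactory.newInstance();
        System.out.println("feature accepted: " + tryAllowJavaEncodings(factory));
    }
}
```

Whether the call succeeds depends entirely on which parser implementation
the factory resolves to, which is the dependency I am objecting to.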

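For point 3, the following sketch shows how a Java environment can already
map an encoding name to a converter, whether the name is an IANA name or a
Java alias. It uses java.nio.charset.Charset from the JDK; the class and
method names are hypothetical, and this is not a description of how Xerces2
resolves encodings internally. On a standard JDK, "UTF8" and "8859_1" are
registered aliases of UTF-8 and ISO-8859-1.

```java
import java.nio.charset.Charset;
import java.nio.charset.IllegalCharsetNameException;
import java.nio.charset.UnsupportedCharsetException;

// Hypothetical helper class, for illustration only.
final class EncodingNameLookup {

    // Returns the canonical charset name the JVM reports for the
    // declared encoding name, or null if the JVM has no such charset.
    static String resolve(String declaredName) {
        try {
            return Charset.forName(declaredName).name();
        } catch (IllegalCharsetNameException | UnsupportedCharsetException e) {
            // The name is syntactically illegal or unknown in this JVM.
            return null;
        }
    }

    public static void main(String[] args) {
        // Two IANA names and two Java-style aliases.
        String[] names = {"UTF-8", "UTF8", "ISO-8859-1", "8859_1"};
        for (String name : names) {
            System.out.println(name + " -> " + resolve(name));
        }
    }
}
```

Since the environment can resolve both kinds of names, accepting both by
default costs the parser very little.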

I have put my views forward for the community to consider, and of course 
the community decision wins :-)

                
Thanks,


Neeraj

