Hi All,
As of right now, Xerces2 doesn't recognize Java encoding names
by default. Xerces2 assumes that encoding names *must* be IANA encoding names
and throws a fatal error in all other cases.
As per the XML 1.0 specification,
http://www.w3.org/TR/REC-xml#sec-entity-decl
<snip>
"It is recommended that character encodings registered (as charsets) with the
Internet Assigned Numbers Authority [IANA-CHARSETS], other than those just
listed, be referred to using their registered names."
</snip>
The spec recommends that IANA names be used, but does not *require* it.
IMO, having Xerces2 throw a fatal error here is too strict, or I would say
not right, since requiring encoding names to be IANA names is not among the
required behaviors of a parser. If needed, the parser may report a warning to
the application. Throwing a fatal error stops further processing, so the
document cannot be processed at all, even though the XML 1.0 specification
gives processors the flexibility (and encourages them) to determine the
encoding externally. It is easy for a processor to find out whether a
particular encoding is supported by the underlying JVM.
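To illustrate that last point, here is a minimal sketch of such a check using
only the JDK's java.nio.charset.Charset API. The class and method names
(JvmEncodingCheck, isSupportedByJvm) are made up for this mail and are not
part of Xerces2.

```java
import java.nio.charset.Charset;

final class JvmEncodingCheck {
    /** Returns true if the running JVM has a charset for the given encoding name. */
    static boolean isSupportedByJvm(String encodingName) {
        if (encodingName == null) {
            return false;
        }
        try {
            // Accepts canonical names and aliases, compared case-insensitively.
            return Charset.isSupported(encodingName);
        } catch (IllegalArgumentException e) {
            // IllegalCharsetNameException: the name is not even syntactically legal.
            return false;
        }
    }

    public static void main(String[] args) {
        String[] names = {"UTF-8", "UTF8", "ISO8859_1", "no-such-encoding"};
        for (String name : names) {
            System.out.println(name + " -> " + isSupportedByJvm(name));
        }
    }
}
```

The lookup goes through the JVM's own alias table, so the parser would not
need to maintain a separate list of Java names.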
Also, the XML 1.0 specification states that "It is recognized that other encodings
are used around the world, and it may be desired for XML processors to read
entities that use them." If one reads erratum E23 at
http://www.w3.org/XML/xml-V10-2e-errata
"It was always the intent of the XML 1.0 spec to allow the character encoding to
be determined externally." Given this, it is a fair assumption on the user's
part that Xerces2 can process XML documents that use a Java encoding name
when that encoding is supported by the underlying JVM. This need not be
restricted to Java encodings; it could be any encoding 'X' for which a custom
reader can be written (or otherwise made available) to Xerces2, in the same
way that support for other international encodings is provided. The processor
should throw a fatal error only if it is still unable to process an entity in
a particular encoding, as required by the XML 1.0 specification.
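A rough sketch of the policy described above, assuming the parser simply asks
the JVM for a decoder: accept any name the JVM can resolve, record a warning
when the declared name is not the charset's canonical name, and fail only when
no decoder exists. LenientEncodingResolver, openReader and the warnings list
are hypothetical names for illustration and do not reflect Xerces2 internals.
The warning fires for any alias, which is only an approximation of "not the
registered name".

```java
import java.io.InputStream;
import java.io.InputStreamReader;
import java.io.Reader;
import java.io.UnsupportedEncodingException;
import java.nio.charset.Charset;
import java.util.List;

final class LenientEncodingResolver {
    /**
     * Opens a Reader for the declared encoding if the JVM supports it.
     * Adds a warning when the name was resolved through an alias, and
     * throws only when the JVM has no charset for the name at all.
     */
    static Reader openReader(InputStream in, String declaredName, List<String> warnings)
            throws UnsupportedEncodingException {
        final Charset charset;
        try {
            charset = Charset.forName(declaredName);
        } catch (IllegalArgumentException e) {
            // Covers illegal names, unknown names, and a null name.
            throw new UnsupportedEncodingException("Encoding not supported: " + declaredName);
        }
        if (!charset.name().equalsIgnoreCase(declaredName)) {
            // Non-fatal: the document is still processed.
            warnings.add("Encoding \"" + declaredName + "\" resolved to " + charset.name());
        }
        return new InputStreamReader(in, charset);
    }
}
```

With this shape, the fatal error is reserved for the case where the entity
truly cannot be decoded, and everything else is at most a warning.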
IMO, Xerces2's behavior should be changed to accept Java encoding names that
are supported by the underlying JVM.
What do other developers and members of the community think?
Thanks
Neeraj