DO NOT REPLY [Bug 5085] - Reporting of externally specified encodings

bugzilla Mon, 26 Nov 2001 22:04:15 -0800

DO NOT REPLY TO THIS EMAIL, BUT PLEASE POST YOUR BUG 
RELATED COMMENTS THROUGH THE WEB INTERFACE AVAILABLE AT
<http://nagoya.apache.org/bugzilla/show_bug.cgi?id=5085>.
ANY REPLY MADE TO THIS MESSAGE WILL NOT BE COLLECTED AND 
INSERTED IN THE BUG DATABASE.


------- Additional Comments From [EMAIL PROTECTED]  2001-11-26 22:05 -------
If the parser never uses an externally specified encoding, then I would 
suggest that this is a bug in the parser.  See

http://www.w3.org/TR/REC-xml#sec-guessing-with-ext-info

and RFC 3023.  If the parser is reading the file over HTTP, it should clearly 
use the charset parameter in the Content-Type header, which means that the 
declared encoding and the actual encoding may be different.
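To illustrate the HTTP case, here is a minimal sketch of pulling the charset parameter out of a Content-Type header value so that it can take precedence over the encoding declaration.  The class and method names (ContentTypeCharset, charsetOf) are hypothetical and not part of Xerces or SAX, and the parsing is deliberately simplified: it does not handle a ";" inside a quoted parameter value.

```java
import java.util.Locale;

// Hypothetical helper, not a Xerces or SAX class.
class ContentTypeCharset {

    /**
     * Returns the value of the charset parameter in a Content-Type
     * header value, or null if the header has no such parameter.
     * Simplification: parameters are split on ';' without regard to
     * quoted strings.
     */
    static String charsetOf(String contentType) {
        if (contentType == null) {
            return null;
        }
        String[] parts = contentType.split(";");
        // parts[0] is the media type itself; parameters follow it.
        for (int i = 1; i < parts.length; i++) {
            String param = parts[i].trim();
            int eq = param.indexOf('=');
            if (eq < 0) {
                continue;
            }
            String name = param.substring(0, eq).trim().toLowerCase(Locale.ROOT);
            if (name.equals("charset")) {
                String value = param.substring(eq + 1).trim();
                // Strip optional surrounding double quotes.
                if (value.length() >= 2 && value.startsWith("\"") && value.endsWith("\"")) {
                    value = value.substring(1, value.length() - 1);
                }
                return value.isEmpty() ? null : value;
            }
        }
        return null;
    }
}
```

Whatever fetches the entity could pass a non-null result to InputSource.setEncoding(), so that the parser decodes the byte stream with the externally specified encoding rather than the declared one.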

If the parser got the entity as a Reader, then it may still be able to report 
the encoding, if the EntityResolver set the encoding field in the InputSource.
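As a rough sketch of that Reader case, an EntityResolver can hand back a character stream and still record the encoding of the original bytes on the InputSource.  The class name FixedEncodingResolver, the inline document text, and the "ISO-8859-1" value are assumptions made up for this example; a real resolver would take the encoding from wherever it obtained the bytes.

```java
import java.io.Reader;
import java.io.StringReader;
import org.xml.sax.EntityResolver;
import org.xml.sax.InputSource;

// Hypothetical resolver used only for illustration.
class FixedEncodingResolver implements EntityResolver {

    public InputSource resolveEntity(String publicId, String systemId) {
        // Stand-in for a Reader the resolver produced by decoding the
        // entity's bytes itself.
        Reader reader = new StringReader("<greeting>hello</greeting>");

        InputSource source = new InputSource(reader);
        source.setSystemId(systemId);
        // The parser does not need this to decode a character stream,
        // but setting it records which encoding the bytes were in, so
        // the information is at least available to be reported.
        source.setEncoding("ISO-8859-1");
        return source;
    }
}
```

The resolver would be installed with XMLReader.setEntityResolver() (or Parser.setEntityResolver() in SAX1).  Whether the parser then passes the value on to the application is exactly the question raised in this bug.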

Although a user-defined EntityResolver is not part of the parser and is thus 
part of the "application", it is often a totally separate module from the part 
of the application that handles the DocumentHandler events.  A particular 
entity resolver may be common to many applications.  It is not desirable to 
force applications to use the EntityResolver to get information about the 
infoset of a document.

Note that DOM Level 3 has actualEncoding and encoding properties on the 
Document interface.  I don't see how Xerces can currently support the 
actualEncoding property correctly.

