On 6 May 2005, at 12:03, Jochen Wiedmann wrote:


For version 3, I have code ready that checks for the presence of Java 1.4. If that is available, an instance of Charset is queried.

Yes that works fine - I'm too used to living with the need to support J2ME. I forget the nice things in 1.4 :)
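A minimal sketch of the kind of check described above, not the actual version 3 code: it probes for the java.nio.charset API (present from Java 1.4) and then asks a CharsetEncoder whether a character can be written directly in the target encoding. The class and method names (EncodableCheck, isCharsetApiAvailable, canEncode) are hypothetical.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;

public final class EncodableCheck {
    private EncodableCheck() {}

    /** Returns true if the java.nio.charset API (Java 1.4 and later) can be loaded. */
    public static boolean isCharsetApiAvailable() {
        try {
            Class.forName("java.nio.charset.Charset");
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }

    /**
     * Returns true if the named encoding can represent the character directly.
     * Charset.forName throws an unchecked exception for unknown or illegal names.
     */
    public static boolean canEncode(String encodingName, char c) {
        CharsetEncoder encoder = Charset.forName(encodingName).newEncoder();
        return encoder.canEncode(c);
    }
}
```

A writer could emit the character as-is when canEncode returns true and fall back to a character reference otherwise.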





For maximum interoperability I would suggest we use UTF-8 but use
character references for all values > 0x7F. This means that even if the
other end gets the encoding wrong it will still almost certainly
understand the characters. If the other end does not understand
character references it will be very easy to see what the problem is
(which is not quite so easy to do if it mistakes UTF-8 for ISO-8859-1,
for example).
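The suggestion above could be sketched roughly as follows. This is an illustration only, not Daniel's patch: the class name AsciiXmlEscaper is hypothetical, and the code uses String.codePointAt, which needs Java 5 or later. Because every code point above 0x7F becomes a numeric character reference, the output is pure ASCII and reads the same whether the receiver assumes UTF-8 or ISO-8859-1.

```java
public final class AsciiXmlEscaper {
    private AsciiXmlEscaper() {}

    /**
     * Escapes the markup characters &, < and >, and writes every code point
     * above 0x7F as a decimal character reference. It does NOT check whether
     * a code point is legal in XML at all; that is a separate concern.
     */
    public static String escape(String text) {
        StringBuilder out = new StringBuilder(text.length() + 16);
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            i += Character.charCount(cp); // step over surrogate pairs as one unit
            switch (cp) {
                case '&': out.append("&amp;"); break;
                case '<': out.append("&lt;"); break;
                case '>': out.append("&gt;"); break;
                default:
                    if (cp > 0x7F) {
                        out.append("&#").append(cp).append(';');
                    } else {
                        out.append((char) cp);
                    }
            }
        }
        return out.toString();
    }
}
```

For example, escape("caf\u00e9") yields the ASCII-only string caf&#233;.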



That is, as far as I can tell, what Daniel's proposed patch does.



Yes, it would appear to do this. However, it also seems to emit invalid XML code points as character references (e.g. the NULL character would be emitted as &#0;). I do not believe that the XML spec allows this. I believe that these code points cannot appear in a well-formed document in any form. The intent is to allow the consuming application to be 100% sure it never sees these characters.
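To make the point concrete, here is a rough sketch of a check against the Char production of XML 1.0 (#x9, #xA, #xD, #x20-#xD7FF, #xE000-#xFFFD, #x10000-#x10FFFF). The class and method names are hypothetical, and rejecting the whole string is only one possible policy; a writer could equally drop or replace the offending code point. What it must not do is emit it as a character reference.

```java
public final class XmlChars {
    private XmlChars() {}

    /** Returns true if the code point matches the XML 1.0 Char production. */
    public static boolean isXmlChar(int cp) {
        return cp == 0x9 || cp == 0xA || cp == 0xD
            || (cp >= 0x20 && cp <= 0xD7FF)
            || (cp >= 0xE000 && cp <= 0xFFFD)
            || (cp >= 0x10000 && cp <= 0x10FFFF);
    }

    /** Throws IllegalArgumentException at the first code point XML 1.0 forbids. */
    public static void requireXmlChars(String text) {
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (!isXmlChar(cp)) {
                throw new IllegalArgumentException(
                    "Code point U+" + Integer.toHexString(cp).toUpperCase()
                    + " at index " + i + " is not allowed in XML 1.0");
            }
            i += Character.charCount(cp);
        }
    }
}
```

With this in place, a string containing the NULL character is refused before any output is written, rather than being serialized as &#0;.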


John Wilson The Wilson Partnership http://www.wilson.co.uk



