Okay, I'll answer my own question:

1. The character \u2019 is not converted to a character reference when UTF-8 is used. It is written as three bytes (E2 80 99) and is not displayed correctly in applications that do not handle UTF-8 properly (e.g. Windows Notepad).
2. Where character references do appear, an editing component is encoding them; that component is not used in the places where the characters are left unencoded.
3. Windows file encodings are a PITA.
4. I know more now than I did before.
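For anyone who finds this in the archive, points 1 and 2 can be checked with plain Java, no JDOM or servlet container needed. This is only a rough sketch: the class and method names (RightQuoteDemo, utf8Hex, misreadAsCp1252, toNumericReferences) are made up for illustration, windows-1252 is just one example of a viewer assuming the wrong encoding, and the escaping helper is my guess at the kind of thing an editing component does, not its actual code.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; none of these names come from JDOM or the servlet API.
final class RightQuoteDemo {

    // Returns the UTF-8 encoding of s as space-separated upper-case hex bytes.
    static String utf8Hex(String s) {
        StringBuilder sb = new StringBuilder();
        for (byte b : s.getBytes(StandardCharsets.UTF_8)) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02X", b & 0xFF));
        }
        return sb.toString();
    }

    // Simulates a viewer that wrongly assumes windows-1252:
    // encode as UTF-8, then decode the same bytes with the wrong charset.
    static String misreadAsCp1252(String s) {
        byte[] utf8 = s.getBytes(StandardCharsets.UTF_8);
        return new String(utf8, Charset.forName("windows-1252"));
    }

    // Replaces every non-ASCII code point with a decimal numeric character
    // reference; ASCII is copied through unchanged (no XML escaping of & or <).
    static String toNumericReferences(String s) {
        StringBuilder sb = new StringBuilder();
        s.codePoints().forEach(cp -> {
            if (cp > 127) {
                sb.append("&#").append(cp).append(';');
            } else {
                sb.appendCodePoint(cp);
            }
        });
        return sb.toString();
    }

    public static void main(String[] args) {
        String text = "it\u2019s";
        System.out.println(utf8Hex("\u2019"));         // bytes written under UTF-8
        System.out.println(misreadAsCp1252(text));     // what a wrong decoder shows
        System.out.println(toNumericReferences(text)); // what pre-escaped data looks like
    }
}
```

The first line of output shows the bytes that go down the wire when the outputter is left to write UTF-8; the last shows the form the data takes when something upstream has already turned the character into a reference.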
Sorry for the noise.

Scott
-- 
Scott Eade
Backstage Technologies Pty. Ltd.
http://www.backstagetech.com.au
.Mac Chat/AIM: seade at mac dot com

On 21/02/2003 6:42 PM, "Scott Eade" <[EMAIL PROTECTED]> wrote:

> I have had a brief scan of the mail archive and not come across anything
> like this, but that said, I am not sure of exactly where this problem might
> be coming from.
>
> Here is what I have:
> 1. Some data in a MySQL database that contains "right single quotation
>    marks" (Unicode U+2019), thanks to the content being pasted in from MS Word.
> 2. The data is included in a CDATA section in a jdom-b8 tree.
> 3. A jdom XMLOutputter created with the encoding set to UTF-8
>      XMLOutputter outputter = new XMLOutputter(" ", true, "UTF-8");
> 4. An HttpServletResponse with ContentType set to "text/xml; charset=UTF-8".
>      HttpServletResponse response = whatever...;
>      response.setContentType("text/xml; charset=UTF-8");
> 5. The Writer for the response is used to output the content
>      outputter.output(doc, response.getWriter());
>      response.flushBuffer();
>
> Now the trouble is that the \u2019 characters do not seem to be written
> correctly to the output stream (I am expecting to see "’" as a
> replacement for these characters, but instead I am seeing the square block
> placeholder - platform is win2k).
>
> I am at a loss as to what to try. I have gone from jdom-b7 to jdom-b8 and from
> xercesj-1.3.0 to xercesj-2.0.2 to xercesj-2.3.0 and the problem persists.
>
> Interestingly, some other characters are being correctly converted to their
> character entity references, but then sometimes they are not in the same
> document.
>
> Any clues would be most welcome. I'll probably try the jdom list as well.
>
> Thanks in advance for any replies.
>
> Cheers,
>
> Scott
