Treatment of CR in XmlWriter

Goh Tor Meng Thu, 29 Dec 2005 17:49:18 -0800

Hi all,

I have a question regarding the treatment of '\r' in the method chardata(String).

Presently my application is trying to send a PGP signature using XML-RPC. However, we found that XmlWriter writes '&#13;' when it encounters a carriage return. This is causing a problem on the receiving side during signature verification.


code snippet from XmlWriter
-----------------------------------------------
          case '\r':
              // Avoid normalization of CR to LF.
              writeCharacterReference(c);
              break;
------------------------------------------------

I understand that '\r' is a valid XML character, and the isValidXMLChar(char c) method also returns true when '\r' is encountered. But before isValidXMLChar is called, '&#13;' has already been written.

I'm trying to understand why this is so. Can the CR be preserved in the XML message instead of being written as '&#13;'?

Thanks in advance for all the help.


regards
Tor Meng






Code excerpt from XmlWriter.java
=======================

  protected void chardata(String text)
      throws XmlRpcException, IOException
  {
      int l = text.length ();
      // ### TODO: Use a buffer rather than going character by
      // ### character to scale better for large text sizes.
      //char[] buf = new char[32];
      for (int i = 0; i < l; i++)
      {
          char c = text.charAt (i);
          switch (c)
          {
          case '\t':
          case '\n':
              write(c);
              break;
          case '\r':
              // Avoid normalization of CR to LF.
              writeCharacterReference(c);
              break;
          case '<':
              write(LESS_THAN_ENTITY);
              break;
          case '>':
              write(GREATER_THAN_ENTITY);
              break;
          case '&':
              write(AMPERSAND_ENTITY);
              break;
          default:
              // Though the XML spec requires XML parsers to support
              // Unicode, not all such code points are valid in XML
              // documents.  Additionally, previous to 2003-06-30
              // the XML-RPC spec only allowed ASCII data (in
              // <string> elements).  For interoperability with
              // clients rigidly conforming to the pre-2003 version
              // of the XML-RPC spec, we entity encode characters
              // outside of the valid range for ASCII, too.
              if (c > 0x7f || !isValidXMLChar(c))
              {
                  // Replace the code point with a character reference.
                  writeCharacterReference(c);
              }
              else
              {
                  write(c);
              }
          }
      }
  }

  /**
   * Section 2.2 of the XML spec describes which Unicode code points
   * are valid in XML:
   *
   * <blockquote><code>#x9 | #xA | #xD | [#x20-#xD7FF] |
   * [#xE000-#xFFFD] | [#x10000-#x10FFFF]</code></blockquote>
   *
   * Code points outside this set must be entity encoded to be
   * represented in XML.
   *
   * @param c The character to inspect.
   * @return Whether the specified character is valid in XML.
   */
  private static final boolean isValidXMLChar(char c)
  {
      switch (c)
      {
      case 0x9:
      case 0xa:  // line feed, '\n'
      case 0xd:  // carriage return, '\r'
          return true;

      default:
          return ( (0x20 <= c && c <= 0xd7ff) ||
                   (0xe000 <= c && c <= 0xfffd) ||
                   (0x10000 <= c && c <= 0x10ffff) );
      }
  }
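
The comment "Avoid normalization of CR to LF" in the '\r' case appears to refer to XML end-of-line handling: a conforming parser turns a literal CR LF pair (or a lone CR) into a single LF before the text reaches the application, whereas a CR written as a character reference is not normalized. The following is a minimal sketch of that behaviour using the JDK's built-in DOM parser, not the parser XML-RPC itself uses. The class name CrRoundTrip and the helpers parseText and show are hypothetical names chosen for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

public class CrRoundTrip {
    // Hypothetical helper: parse a small XML document and return the
    // text content of its root element as the application would see it.
    static String parseText(String xml) throws Exception {
        DocumentBuilder builder =
            DocumentBuilderFactory.newInstance().newDocumentBuilder();
        Document doc = builder.parse(
            new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
        return doc.getDocumentElement().getTextContent();
    }

    // Hypothetical helper: make CR and LF visible when printing.
    static String show(String s) {
        return s.replace("\r", "\\r").replace("\n", "\\n");
    }

    public static void main(String[] args) throws Exception {
        // Literal CR LF in the document: the parser normalizes it to LF.
        String literal = parseText("<string>line1\r\nline2</string>");
        // CR sent as a character reference: it survives parsing.
        String escaped = parseText("<string>line1&#13;\nline2</string>");

        System.out.println("literal CR:   " + show(literal));
        System.out.println("reference CR: " + show(escaped));
    }
}
```

If the parser follows the XML specification, the first line shows the CR dropped and the second shows it preserved. Under that assumption, a receiver that reads the value through a conforming XML parser should get the original CR back from '&#13;', so a signature mismatch may point to the receiving side inspecting the raw XML text or not expanding character references.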
