This discussion piqued my interest (what did Scott mean when he said the whitespace issues are a real mess?), so I did a little research. For those who know as little as I, here's what I found.
Base64 appears to have a number of variants. The basic encoding is consistent, but there's variation in:

- what whitespace is allowed, and where, and
- whether line wrapping is allowed and/or required.

RFC 2045 (MIME Part One) specifies that no more than 76 characters can appear on a line. Line breaks are thus required for anything longer than that. Any whitespace (or other unrecognized character) is to be ignored.

XML Schema's base64Binary type is specifically disallowed from enforcing MIME's 76-character limit. (This seems appropriate, since XML documents are not subject to the same transmission constraints as MIME messages.) A single whitespace character is allowed between characters in the base64 alphabet, but other characters outside the base64 alphabet are not allowed. On the other hand, the canonical lexical form of a base64Binary data value consists of lines of 76 base64 characters (except for the last line, which may be shorter).

RFC 3548 (The Base16, Base32, and Base64 Data Encodings) says, "Implementations MUST NOT not [sic] add line feeds to base encoded data unless the specification referring to this document explicitly directs base encoders to add line feeds after a specific number of characters." Furthermore, "Implementations MUST reject the encoding if it contains characters outside the base alphabet when interpreting base encoded data, unless the specification referring to this document explicitly states otherwise." (CR and LF are both outside the base64 alphabet.)

So, it looks like generic base64 encoders and decoders would have to be told what the whitespace and line wrapping rules are. Rather than have such generic functions, Xerces-C includes functions that attempt to implement what is needed and no more: Schema base64Binary. I haven't looked at the implementation, but it looks to me like an encoder that generates base64Binary and complies with RFC 3548 would either omit whitespace (including line breaks) entirely OR generate Schema's canonical form.
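To make the differences concrete, here is a minimal sketch using Python's standard base64 module. It is not the Xerces-C code (which I haven't read), only an illustration. The helper names (decode_lenient, decode_strict, decode_whitespace_tolerant, strict_accepts) are made up for this example, and the whitespace-tolerant decoder is only a rough stand-in for Schema's rule: it accepts any amount of ASCII whitespace anywhere, which is looser than what base64Binary allows. Also note that Python wraps lines with a bare LF, whereas MIME messages use CRLF.

```python
import base64
import binascii

# 100 arbitrary bytes encode to 136 base64 characters.
data = bytes(range(100))

# RFC 3548 style: the encoder adds no line feeds.
flat = base64.b64encode(data)

# MIME (RFC 2045) style: output lines of at most 76 characters.
wrapped = base64.encodebytes(data)


def decode_lenient(text: bytes) -> bytes:
    """MIME-like decoding: characters outside the alphabet are discarded."""
    return base64.b64decode(text)  # validate=False is the default


def decode_strict(text: bytes) -> bytes:
    """RFC 3548-like decoding: any character outside the alphabet is an error."""
    return base64.b64decode(text, validate=True)


def decode_whitespace_tolerant(text: bytes) -> bytes:
    """Rough approximation of Schema's rule (an assumption, see lead-in):
    drop ASCII whitespace, then reject anything else outside the alphabet."""
    stripped = b"".join(text.split())
    return base64.b64decode(stripped, validate=True)


def strict_accepts(text: bytes) -> bool:
    """Return True if the strict decoder accepts the input."""
    try:
        decode_strict(text)
        return True
    except binascii.Error:
        return False


if __name__ == "__main__":
    print("line lengths (wrapped):", [len(line) for line in wrapped.splitlines()])
    print("strict accepts flat:   ", strict_accepts(flat))
    print("strict accepts wrapped:", strict_accepts(wrapped))
    print("lenient round trip:    ", decode_lenient(wrapped) == data)
    print("tolerant round trip:   ", decode_whitespace_tolerant(wrapped) == data)
```

The point of the sketch is that the same wrapped input is fine for the lenient and whitespace-tolerant decoders but is rejected by the strict one, so a "generic" decoder really does need to be told which rules apply.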
I may have missed something or gotten it wrong, but I hope this summary gives an idea of how much of a mess this stuff is.
