The actual statement in the XML spec (section 2.4) is "The right angle bracket (>) may be represented using the string "&gt;", and must, for compatibility, be escaped using "&gt;" or a character reference when it appears in the string "]]>" in content, when that string is not marking the end of a CDATA section." AFAICS "must" *means* must, even though this is only for SGML compatibility. If you're going to be compliant with XML you need to at least look for the "]]>" pattern outside CDATA and escape the '>' in that particular case. Escaping it everywhere is a simpler (although higher-overhead) solution.
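
To make the two options concrete, here is a minimal Python sketch of both. The function names are hypothetical, and it assumes the whole text node is available as one string; a serializer that writes character data in chunks would also have to catch a "]]" at the end of one chunk followed by ">" at the start of the next.

```python
def escape_text_minimal(text):
    """Escape character data, touching '>' only where it ends a "]]>" run."""
    # '&' must be handled first so the entities added below are not re-escaped.
    text = text.replace("&", "&amp;").replace("<", "&lt;")
    # Only the '>' that closes "]]>" needs escaping in content.
    return text.replace("]]>", "]]&gt;")


def escape_text_always(text):
    """Simpler alternative: escape every '>' regardless of context."""
    return (text.replace("&", "&amp;")
                .replace("<", "&lt;")
                .replace(">", "&gt;"))


if __name__ == "__main__":
    sample = "if a > b then x]]>y"
    print(escape_text_minimal(sample))
    print(escape_text_always(sample))
```

The minimal version leaves an ordinary '>' alone and only rewrites the "]]>" case; the second version trades a little output size for not having to look at context at all.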

 - Dennis

Joseph Kesselman wrote:



In fact, the XML grammar is such that a parser *can't* get confused about
how to interpret the '>' character. &gt; is provided only for stylistic
reasons, because folks thought "&lt;foo&gt;" would express the intent more
clearly to a human reader than "&lt;foo>" would. Unless you plan to
hand-edit your XML documents there really is no reason to escape that
character -- and good reason not to, since doing so adversely impacts
parsing and serialization speed, as well as file size.
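
As a quick illustration of the point above, the following sketch uses Python's standard xml.etree.ElementTree (chosen only for illustration, not the parser under discussion in this thread) to show that an unescaped '>' in element content or in an attribute value parses without complaint.

```python
import xml.etree.ElementTree as ET

# A bare '>' in character data is accepted as-is.
content_text = ET.fromstring("<doc>if a > b then</doc>").text

# The same holds inside an attribute value.
attr_text = ET.fromstring('<doc note="a > b"/>').get("note")

print(repr(content_text))
print(repr(attr_text))
```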



I believe the motivation for always escaping '>' had to do with
']]>', which is the end delimiter for a CDATA section.



Nope. It's true that ']]>' can't appear within a <![CDATA[ ]]> section, but escaping doesn't solve that (in fact, &gt; would itself be "escaped" by the <![CDATA[ ]]> and treated as the equivalent of &amp;gt;).
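
A small Python sketch of that behaviour, again using the standard xml.etree.ElementTree purely for illustration: inside a CDATA section the string &gt; comes back as literal text, exactly as &amp;gt; does outside one. The helper split_cdata_end is a hypothetical name, showing the usual workaround of ending the section between "]]" and ">" and reopening it, which is the only way to carry "]]>" through CDATA.

```python
import xml.etree.ElementTree as ET

# Inside CDATA, entity references are not expanded: the text stays "&gt;".
in_cdata = ET.fromstring("<a><![CDATA[x &gt; y]]></a>").text
# Outside CDATA, "&amp;gt;" also yields the literal characters "&gt;".
amp_escaped = ET.fromstring("<a>x &amp;gt; y</a>").text
# Outside CDATA, "&gt;" is expanded to '>'.
plain_ref = ET.fromstring("<a>x &gt; y</a>").text


def split_cdata_end(data):
    """Hypothetical helper: split any "]]>" across two CDATA sections."""
    return data.replace("]]>", "]]]]><![CDATA[>")


original = "a]]>b"
wrapped = "<a><![CDATA[" + split_cdata_end(original) + "]]></a>"
round_tripped = ET.fromstring(wrapped).text

print(repr(in_cdata), repr(amp_escaped), repr(plain_ref))
print(repr(round_tripped))
```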

______________________________________
Joe Kesselman, IBM Next-Generation Web Technologies: XML, XSL and more.
"The world changed profoundly and unpredictably the day Tim Berners Lee
got bitten by a radioactive spider." -- Rafe Culpin, in r.m.filk


--------------------------------------------------------------------- To unsubscribe, e-mail: [EMAIL PROTECTED] For additional commands, e-mail: [EMAIL PROTECTED]







