The actual statement in the XML spec (section 2.4) is "The right angle
bracket (>) may be represented using the string "&gt;", and must, for
compatibility, be escaped using "&gt;" or a character reference when it
appears in the string "]]>" in content, when that string is not marking
the end of a CDATA section." AFAIKS "must" *means* must, even though
this is only for SGML compatibility. If you're going to be compliant
with XML you need to at least look for the "]]>" pattern outside CDATA
and escape the '>' in that particular case. Escaping it everywhere is a
simpler (although higher overhead) solution.
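
To make the two options concrete, here is a rough sketch in Python of
what a serializer could do for character data. The function name
escape_text and the escape_all_gt flag are made up for illustration;
they are not taken from any particular serializer.

```python
def escape_text(text: str, escape_all_gt: bool = False) -> str:
    """Escape character data for use as XML element content (sketch)."""
    # '&' has to be handled first so the entities emitted below are not
    # escaped a second time.
    out = text.replace("&", "&amp;").replace("<", "&lt;")
    if escape_all_gt:
        # Simple option: escape every '>' regardless of context.
        return out.replace(">", "&gt;")
    # Minimal option: escape '>' only where it would complete "]]>".
    return out.replace("]]>", "]]&gt;")
```

With the default, escape_text("a ]]> b > c") gives "a ]]&gt; b > c";
with escape_all_gt=True the second '>' is escaped as well.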
- Dennis
Joseph Kesselman wrote:
In fact, the XML grammar is such that a parser *can't* get confused about
how to interpret the '>' character. &gt; is provided only for stylistic
reasons, because folks thought "&lt;foo&gt;" would express the intent more
clearly to a human reader than "&lt;foo>" would. Unless you plan to
hand-edit your XML documents there really is no reason to escape that
character -- and good reason not to, since doing so adversely impacts
parsing and serialization speed, as well as file size.
I believe the motivation for always escaping '>' had to do with
']]>', which is the end delimiter for a CDATA section.
Nope. It's true that ']]>' can't appear within a <![CDATA[ ]]>, but escaping
doesn't solve that (and in fact &gt; would be "escaped" by the <![CDATA[ ]]>
and treated as the equivalent of &amp;gt;).
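
A small check of that last point, using Python's standard
xml.etree.ElementTree parser as a stand-in for any conforming parser;
the element name r and the sample text are arbitrary.

```python
import xml.etree.ElementTree as ET

# Inside a CDATA section the entity reference is not expanded; the
# five characters "&gt;" come through as literal text.
in_cdata = ET.fromstring("<r><![CDATA[a &gt; b]]></r>").text

# In ordinary content the same reference is expanded to '>'.
in_content = ET.fromstring("<r>a &gt; b</r>").text

# A bare '>' in ordinary content parses without being escaped at all.
bare = ET.fromstring("<r>a > b</r>").text
```

Here in_cdata is the string "a &gt; b", while in_content and bare are
both "a > b".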
______________________________________
Joe Kesselman, IBM Next-Generation Web Technologies: XML, XSL and more.
"The world changed profoundly and unpredictably the day Tim Berners Lee
got bitten by a radioactive spider." -- Rafe Culpin, in r.m.filk
---------------------------------------------------------------------
To unsubscribe, e-mail: [EMAIL PROTECTED]
For additional commands, e-mail: [EMAIL PROTECTED]
---------------------------------------------------------------------