On Sun, 24 Jan 2010 23:20:42 +0100, Nick Wellnhofer <[email protected]>
wrote:

> [...] It seems that the default behavior of libxml is to encode "\r" as "&#13;". But there is an exception for HTML in xmlEncodeEntitiesReentrant in entities.c. I haven't checked, but looking

This would confirm our assumption that it's libxml which treats \r differently depending on the output format.

> at the source the XHTML serialization code seems to call xmlEscapeContent in xmlIO.c. There's also xmlEscapeEntities in xmlsave.c but that uses hex char refs. Those two functions don't make an exception for XHTML content.
>
> Personally, I think libxml shouldn't escape "\r" at all.

Since one function distinguishes between HTML and XHTML while the others escape \r, I wonder what the use cases looked like. So far, it would also make more sense to me if \r were not escaped for XHTML: at least one popular reading system for ePub files (which contain XHTML files) shows a question mark for each &#13; character reference.
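
To make the behavior under discussion concrete, here is a minimal sketch in Python rather than libxml's C. The helper escape_text and its html flag are hypothetical and only mimic the rule described above (escape "\r" as "&#13;" unless the output is HTML); this is not libxml's actual code. The second half uses Python's standard ElementTree parser to show one plausible reason for escaping in XML output: a conforming XML parser normalizes a literal carriage return to a line feed, while the character reference survives parsing.

```python
import xml.etree.ElementTree as ET


def escape_text(text, html=False):
    """Hypothetical helper (not libxml's code): escape text content.

    Mimics the rule described in this thread: "\r" becomes "&#13;"
    for XML output, but is left alone for HTML output.
    """
    out = text.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;")
    if not html:
        out = out.replace("\r", "&#13;")
    return out


# Same input, two output modes.
xml_out = escape_text("a\rb")              # CR written as a character reference
html_out = escape_text("a\rb", html=True)  # CR written literally

# Round trip through an XML parser: a literal CR is normalized to LF,
# whereas the character reference is decoded back to a real CR.
literal_cr = ET.fromstring("<p>a\rb</p>").text
escaped_cr = ET.fromstring("<p>a&#13;b</p>").text
```

So escaping preserves the carriage return across an XML round trip, which may be the use case behind the default. It does not help a reading system that fails to decode &#13; in XHTML, which is the problem described above.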

Boris

_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
