Hello,
I am working on a project that uses either libxslt or Xalan for XSLT
transformations.
We have internally deprecated Xalan because libxslt is considerably faster and
all our other XML processing is done by libxml2.
We would now like to drop Xalan completely, but there is one important case
where the two libraries produce different output, which prevents us from
doing so.

Consider any XML file and the following stylesheet:


<xsl:stylesheet xmlns:xsl="http://www.w3.org/1999/XSL/Transform" version='1.0'>
<xsl:output method="html"/>
<xsl:variable name="apache">&lt;!--apache-stuff--></xsl:variable>
<xsl:variable name="script">&amp;{My script};</xsl:variable>

<xsl:template match="/">
    <a href="{$apache}/page.html" onMouseUp="{$script}">link</a>
</xsl:template>

</xsl:stylesheet>



libxml2/libxslt currently produce the following file from the transformation:

<a href="&lt;!--apache-stuff--&gt;/page.html" onMouseUp="&amp;{My script};">link</a>


Xerces/Xalan produce:

<a href="<!--apache-stuff-->/page.html" onMouseUp="&{My script};">link</a>


The <!--apache-stuff--> part is supposed to be replaced by the web server for
load-balancing purposes, but this does not happen with libxslt because the
marker is escaped (&lt; and &gt;). That is the issue we are running into.
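
To make the failure concrete, here is a minimal Python sketch. The substitute()
function and the backend URL are hypothetical: they assume the server does a
plain textual replacement of the marker, which is a simplification. The two
strings are the outputs quoted above.

```python
# Hypothetical stand-in for the web server's substitution step: it is assumed
# to look for the literal comment marker and replace it with a backend URL.
PLACEHOLDER = "<!--apache-stuff-->"


def substitute(page: str, backend: str) -> str:
    """Replace every literal occurrence of the marker with the backend URL."""
    return page.replace(PLACEHOLDER, backend)


# Output as produced by Xerces/Xalan (marker left unescaped).
xalan_out = '<a href="<!--apache-stuff-->/page.html" onMouseUp="&{My script};">link</a>'
# Output as produced by libxml2/libxslt (marker escaped).
libxslt_out = '<a href="&lt;!--apache-stuff--&gt;/page.html" onMouseUp="&amp;{My script};">link</a>'

# The marker is found and replaced in the Xalan output...
print(substitute(xalan_out, "http://backend1.example"))
# ...but the escaped form never matches, so the libxslt output is unchanged.
print(substitute(libxslt_out, "http://backend1.example") == libxslt_out)
```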

I have tracked it down, and the problem lies within libxml2, not libxslt (which
is why I am posting on this list), at the point where the node tree is
serialized to text. The enclosed patches fix this, and also implement a TODO
that you had in the code:

The html output method should not escape a & character occurring in an 
attribute value immediately followed by a { character (see Section B.7.1 of the 
HTML 4.0 Recommendation).

This is illustrated by the &{My script} part in the example above.
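
As a rough illustration of the rule (not the code in the patches, which are
written in C against libxml2), the escaping could be sketched in Python as
follows. The function name escape_html_attr is made up for this example, and
leaving '<' and '>' untouched is an assumption that simply mirrors the
Xerces/Xalan output shown above.

```python
def escape_html_attr(value: str) -> str:
    """Escape an HTML attribute value for output inside double quotes.

    Illustrative sketch only: '&' is left alone when immediately followed
    by '{' (the script-macro case), '"' is always escaped, and '<' / '>'
    are passed through unchanged (an assumption, see the lead-in).
    """
    out = []
    for i, ch in enumerate(value):
        if ch == "&":
            # Slicing avoids an IndexError when '&' is the last character.
            if value[i + 1:i + 2] == "{":
                out.append("&")       # "&{" is kept as is
            else:
                out.append("&amp;")   # any other '&' is escaped
        elif ch == '"':
            out.append("&quot;")      # would otherwise end the attribute
        else:
            out.append(ch)
    return "".join(out)


print(escape_html_attr("&{My script};"))                  # &{My script};
print(escape_html_attr("a & b"))                          # a &amp; b
print(escape_html_attr("<!--apache-stuff-->/page.html"))  # unchanged
```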

To get back to my issue, however: I am not completely sure which behavior is
actually correct, as I could not find out whether '<' and '>' are allowed in
attribute values in HTML (I know '<' is forbidden in XML).
I ran the regression tests, but they added to my confusion.
Some HTML tests now fail in the test suite (runtest), but if I run:
./testHTML test/HTML/lt.html
then the output is a lot closer to the input file test/HTML/lt.html than it was
before, so this may actually be an improvement.
If this is indeed correct, I am of course open to any suggestions or comments
you may have about the patches. They should apply cleanly to the git trunk.

Thank you for your work on the libraries :)
Romain

P.S.: timsort.h is missing from the downloadable hourly git snapshot,
libxml2-git-snapshot.tar.gz.

Attachments:
  entities.c.patch
  entities.h.patch
  HTMLtree.c.patch
  tree.c.patch
  tree.h.patch

_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
https://mail.gnome.org/mailman/listinfo/xml