Since no one has responded so far, I have filed a bug report on this:

http://bugzilla.gnome.org/show_bug.cgi?id=495213
Stefan

Stefan Behnel wrote:
> I noticed a problem with the new way libxml2 2.6.29+ handles the HTML "embed"
> tag. It serialises it without the closing tag, which then makes subsequent
> attempts to parse the document fail, because the information about where the
> tag is closed gets lost. Here's an example:
>
> $ cat embed.html
> <html><body>
> <embed src="http://www.youtube.com/v/183tVH1CZpA"
> type="application/x-shockwave-flash"></embed>
> <embed src="http://anothersite.com/v/another"></embed>
> <script src="http://www.youtube.com/example.js"></script>
> <script src="/something-else.js"></script>
> </body></html>
>
> $ xmllint --html embed.html > embed2.html
>
> $ cat embed2.html
> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
> "http://www.w3.org/TR/REC-html40/loose.dtd">
> <html><body>
> <embed src="http://www.youtube.com/v/183tVH1CZpA"
> type="application/x-shockwave-flash"><embed
> src="http://anothersite.com/v/another"><script
> src="http://www.youtube.com/example.js"></script><script
> src="/something-else.js"></script>
> </body></html>
>
> $ xmllint --html embed2.html > embed3.html
>
> $ cat embed3.html
> <!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.0 Transitional//EN"
> "http://www.w3.org/TR/REC-html40/loose.dtd">
> <html><body>
> <embed src="http://www.youtube.com/v/183tVH1CZpA"
> type="application/x-shockwave-flash"><embed
> src="http://anothersite.com/v/another"><script
> src="http://www.youtube.com/example.js"></script><script
> src="/something-else.js"></script></embed></embed>
> </body></html>
>
> Note that the "script" tags have moved into the "embed" tag, although
> originally they were siblings.
>
> I think the place to fix this is the serialiser rather than the parser. It
> should always emit a closing tag here.
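
To make the effect of the lost end tags easier to see, here is a rough sketch that does not use libxml2 at all. It relies on Python's standard html.parser module and a deliberately naive open-element stack, on the assumption that an element stays open until its explicit end tag is seen (roughly what a parser does when it does not treat "embed" as an empty element). The names ParentTracker and script_parents are made up for this illustration.

```python
from html.parser import HTMLParser


class ParentTracker(HTMLParser):
    """Record the parent of every start tag, using a naive open-element stack.

    Assumption: an element stays open until its explicit end tag is seen.
    This is an illustration only, not how libxml2 is implemented.
    """

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open elements, innermost last
        self.parents = []  # (tag, parent) pairs in document order

    def handle_starttag(self, tag, attrs):
        parent = self.stack[-1] if self.stack else None
        self.parents.append((tag, parent))
        self.stack.append(tag)

    def handle_endtag(self, tag):
        # Close everything up to and including the nearest matching open tag.
        if tag in self.stack:
            while self.stack.pop() != tag:
                pass


def script_parents(html):
    """Return the parent tag name of each "script" element in the input."""
    tracker = ParentTracker()
    tracker.feed(html)
    tracker.close()
    return [parent for tag, parent in tracker.parents if tag == "script"]


# The document from embed.html, with explicit </embed> end tags.
ORIGINAL = (
    '<html><body>'
    '<embed src="http://www.youtube.com/v/183tVH1CZpA"'
    ' type="application/x-shockwave-flash"></embed>'
    '<embed src="http://anothersite.com/v/another"></embed>'
    '<script src="http://www.youtube.com/example.js"></script>'
    '<script src="/something-else.js"></script>'
    '</body></html>'
)

# Same markup with the </embed> end tags dropped, as in embed2.html.
RESERIALISED = ORIGINAL.replace("</embed>", "")

print(script_parents(ORIGINAL))      # ['body', 'body']
print(script_parents(RESERIALISED))  # ['embed', 'embed']
```

With the end tags present, both "script" elements are children of "body". Once they are dropped, a parser following the assumption above has no way to tell where each "embed" ends, so the scripts end up nested inside it, which matches what embed3.html shows.
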
>
> Stefan
>
> _______________________________________________
> xml mailing list, project page http://xmlsoft.org/
> [email protected]
> http://mail.gnome.org/mailman/listinfo/xml
