On Tue, Nov 15, 2005 at 11:38:33AM +1100, Michael Day wrote:
> 
> Hi Daniel,
> 
> The invalid comment in wired.html is this:
> 
>     <!------TRADES--------->
> 
> Because it has an odd number of "--" sequences the comment is actually not
> terminated according to the SGML rules.
> 
> Web browsers will actually parse this comment differently depending on
> whether they are using standards-mode or quirks-mode to parse the
> document.
> 
> I have attached an HTML document that demonstrates the issue. If you open
> it in Mozilla, it will be parsed in standards-mode because it has a
> DOCTYPE declaration. In this case the comment will not be terminated and
> some of the document text will be hidden. If you delete the DOCTYPE it
> will be parsed in quirks-mode, the comment will be terminated and the text
> will be shown.
> 
> I cannot think of any way to detect comment termination that will handle
> both cases correctly without adding a quirks-mode feature to the libxml
> HTMLparser; there is no other way to parse old HTML and new HTML and get
> them both right.
> 
> Would it be reasonable for me to add a quirks-mode flag to the HTML parser
> that would only toggle comment parsing behaviour for now?

  I would just use the existing HTML_PARSE_RECOVER mode flag for this, though
in a sense I would have preferred the default behaviour to be maintained.
I really think that a wrong number of '-' in comments is a frequent
mistake, and even if SGML says the comment is not ended, we should not miss
the start tag of the next element. The error is too benign and the effects
of the new code are too severe; this feels unbalanced, especially as it
is a change from the current behaviour.
  I don't know how best to handle this...
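
  To make the two termination rules concrete, here is a minimal Python
sketch. It is not libxml2 or browser code: the function names are
hypothetical, the lenient rule is only an approximation of what a
quirks-mode parser does, and the strict rule is a simplified reading of the
SGML comment declaration ("<!", then zero or more "-- ... --" comments
separated by whitespace, then ">").

```python
from typing import Optional

# The comment quoted from wired.html.
SAMPLE = "<!------TRADES--------->"


def lenient_comment_end(text: str, start: int = 0) -> Optional[int]:
    """Lenient rule (assumed quirks-like): the first '-->' ends the comment.

    Returns the index just past the comment, or None if no '-->' is found.
    """
    if not text.startswith("<!--", start):
        return None
    end = text.find("-->", start + 4)
    return None if end == -1 else end + 3


def sgml_comment_end(text: str, start: int = 0) -> Optional[int]:
    """Strict rule: every '--' toggles between inside and outside a comment.

    Returns the index just past the closing '>', or None if the declaration
    is unterminated or has stray text between comments.
    """
    if not text.startswith("<!--", start):
        return None
    pos = start + 2  # just past "<!"
    while pos < len(text):
        if text.startswith("--", pos):
            # Opening delimiter: skip to the matching closing "--".
            close = text.find("--", pos + 2)
            if close == -1:
                return None  # still inside a comment at end of input
            pos = close + 2
        elif text[pos].isspace():
            pos += 1  # whitespace is allowed between comments
        elif text[pos] == ">":
            return pos + 1  # declaration closed outside any comment
        else:
            return None  # stray text outside a comment
    return None


if __name__ == "__main__":
    doc = SAMPLE + "<p>some text</p>-->"
    print("lenient end:", lenient_comment_end(doc))
    print("strict end: ", sgml_comment_end(doc))
```

  With a dash count like the one in the quoted comment, the lenient rule
stops at the first "-->" while the strict rule keeps reading until a later
"--", so whatever lies in between is swallowed. That is the hidden-text
effect described in the quoted message.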

Daniel

-- 
Daniel Veillard      | Red Hat http://redhat.com/
[EMAIL PROTECTED]  | libxml GNOME XML XSLT toolkit  http://xmlsoft.org/
http://veillard.com/ | Rpmfind RPM search engine http://rpmfind.net/