Re: [xml] Apparently incorrect paragraph wrapping in HTML parser

iSteve Thu, 12 Jan 2006 05:35:45 -0800

  Yes, thanks! That sounds like the right approach to me, I would just
merge that with a new htmlParserOption HTML_PARSE_STRICT, which could be
either passed by the user to maintain the current behaviour or activated by
default when the DOCTYPE is read if it happens to be a Strict HTML one.

Yes, checking the DTD is indeed an option, though I'm not sure how it would handle the case in which I link a DTD myself?

E.g.:

<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" "http://very.silly/html401-like/but/not/exactly/strict.dtd">
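
To make the ambiguity concrete, here is a rough Python sketch (not libxml2 code, and not how the parser currently decides anything). The helper names are hypothetical and the regular expression only handles the simple PUBLIC form shown above. It assumes "Strict" would be recognised either by the public identifier alone or by the public and system identifiers together, and shows that the two checks disagree for a DOCTYPE like mine:

```python
import re

# Identifiers of the W3C HTML 4.01 Strict DTD.
STRICT_PUBLIC_ID = "-//W3C//DTD HTML 4.01//EN"
STRICT_SYSTEM_ID = "http://www.w3.org/TR/html4/strict.dtd"

# Matches only the simple form: <!DOCTYPE html PUBLIC "pubid" "sysid">
# (the system identifier is optional).
DOCTYPE_RE = re.compile(
    r'<!DOCTYPE\s+html\s+PUBLIC\s+"([^"]*)"(?:\s+"([^"]*)")?\s*>',
    re.IGNORECASE,
)


def parse_doctype(text):
    """Return (public_id, system_id), or None if no PUBLIC doctype is found."""
    match = DOCTYPE_RE.search(text)
    if match is None:
        return None
    return match.group(1), match.group(2)


def is_strict_by_public_id(text):
    """Hypothetical check: trust the public identifier only."""
    ids = parse_doctype(text)
    return ids is not None and ids[0] == STRICT_PUBLIC_ID


def is_strict_by_both_ids(text):
    """Hypothetical check: require the W3C system identifier as well."""
    ids = parse_doctype(text)
    return ids is not None and ids == (STRICT_PUBLIC_ID, STRICT_SYSTEM_ID)


if __name__ == "__main__":
    custom = (
        '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01//EN" '
        '"http://very.silly/html401-like/but/not/exactly/strict.dtd">'
    )
    print(is_strict_by_public_id(custom))  # True
    print(is_strict_by_both_ids(custom))   # False
```

Whichever rule is picked, a document with a self-linked DTD ends up treated either as Strict when it may not be, or as non-Strict despite carrying the Strict public identifier.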

Anyway, I do not see any reason why the parser should mess with the document in the first place; it's supposed to parse it, not deliberately alter it according to what it thinks may be the right solution. Could someone please explain to me why the document should be altered?

And please, do not say "to be compliant with standards", because standards, to the best of my knowledge, do not require the parser to "fix" the document by adding tags when it is not considered correct (though I may be wrong, I doubt standards would require such a thing).


 -- iSteve

PS.: The <p> tag injection is not correct anyway. The "<img>" tag is inline, yet it is not wrapped in <p>. Still want to keep it?


For details, see: http://www.w3.org/TR/REC-html40/sgml/dtd.html#inline

_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
