Re: [xml] Parsing tag-soup HTML

Michael Day Sun, 17 Jun 2007 23:52:18 -0700

Hi Nick,

>  Coming back with some kind of definition of what a tag soup parser
> behaviour is is probably more important than digging in libxml2 code.
> I am not sure we can emulate web browser parsers behaviour.


It's worth looking at the HTML5 specification:

http://www.whatwg.org/specs/web-apps/current-work/

Section 8, "The HTML Syntax", is the relevant bit. It still needs some
work, but it's actively being developed and is a good starting point for
figuring out how to treat messy real-world HTML and, hopefully, get
behaviour similar to web browsers.
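
To give a feel for the kind of recovery rules such a definition would
have to pin down, here is a small Python sketch. It is not the HTML5
algorithm and has nothing to do with libxml2's internals. The class
name SoupOutline, the normalize() helper and the three rules (a new <p>
closes an open <p>, stray end tags are ignored, everything left open is
closed at end of input) are simplifications I made up for illustration.

```python
from html.parser import HTMLParser


class SoupOutline(HTMLParser):
    """Toy recovery rules; NOT the HTML5 tree-construction algorithm."""

    def __init__(self):
        super().__init__()
        self.stack = []   # names of currently open elements
        self.events = []  # normalized ("open" | "close" | "text", value) events

    def handle_starttag(self, tag, attrs):
        # Assumed rule: a new <p> implicitly closes a still-open <p>.
        if tag == "p" and "p" in self.stack:
            self._close_through("p")
        self.stack.append(tag)
        self.events.append(("open", tag))

    def handle_endtag(self, tag):
        # Assumed rule: ignore end tags that match no open element.
        if tag in self.stack:
            self._close_through(tag)

    def handle_data(self, data):
        text = data.strip()
        if text:
            self.events.append(("text", text))

    def _close_through(self, tag):
        # Pop and close open elements until the named one has been closed.
        while self.stack:
            name = self.stack.pop()
            self.events.append(("close", name))
            if name == tag:
                break

    def finish(self):
        # Flush any buffered text, then close whatever is still open.
        self.close()
        while self.stack:
            self.events.append(("close", self.stack.pop()))
        return self.events


def normalize(soup):
    """Return a well-nested event list for a tag-soup string."""
    parser = SoupOutline()
    parser.feed(soup)
    return parser.finish()


if __name__ == "__main__":
    for event in normalize("<p>one<b>two<p>three</i>"):
        print(event)
```

The point is only that each rule is an explicit, testable decision. The
spec's value is that it writes those decisions down, so that different
parsers can agree on them.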

Best regards,

Michael

-- 
Print XML with Prince!
http://www.princexml.com
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
