How robust is

    xmllint --html --xmlout

on badly malformed input?

Is it possible to confuse it so badly it won't continue or will generate 
ill-formed markup? Or will it keep on trucking no matter what?

How does the HTML parser handle bogons (unrecognized elements)? Are they 
treated as empty or dropped or something else?
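
One way to probe these questions empirically is a small harness that feeds deliberately broken HTML to the command above and checks whether what comes back is well-formed. This is only a rough sketch: it assumes xmllint is on the PATH and accepts "-" for standard input, and the sample inputs and helper names (is_well_formed, html_to_xml) are made up for illustration.

```python
import shutil
import subprocess
import xml.etree.ElementTree as ET


def is_well_formed(xml_text):
    """Return True if xml_text parses as well-formed XML."""
    try:
        ET.fromstring(xml_text)
        return True
    except ET.ParseError:
        return False


def html_to_xml(html_bytes):
    """Pipe HTML through `xmllint --html --xmlout -`.

    Returns the raw stdout bytes, or None if xmllint is not installed.
    Assumption: "-" tells xmllint to read the document from stdin.
    """
    if shutil.which("xmllint") is None:
        return None
    result = subprocess.run(
        ["xmllint", "--html", "--xmlout", "-"],
        input=html_bytes,
        capture_output=True,
    )
    return result.stdout


# Hypothetical stress inputs: misnested tags, an unrecognized element,
# and plain garbage.
SAMPLES = [
    b"<p>unclosed <b>bold <i>mixed</b></i>",
    b"<html><body><bogon>what happens here?</bogon></body></html>",
    b"<<<>>>",
]

if __name__ == "__main__":
    for sample in SAMPLES:
        output = html_to_xml(sample)
        if output is None:
            print("xmllint not found; skipping")
            break
        print(sample, "->", "well-formed" if is_well_formed(output) else "NOT well-formed")
```

Inspecting the output for the second sample would also show how an unrecognized element is serialized, which the well-formedness check alone does not reveal.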

How good an alternative is this to TagSoup and Tidy?

I'm working on a book about converting messy old HTML to clean XHTML, 
and I'm trying to decide exactly how much of each tool to recommend when.

-- 
Elliotte Rusty Harold
Java I/O 2nd Edition Just Published!
http://www.cafeaulait.org/books/javaio2/
http://www.amazon.com/exec/obidos/ISBN=0596527500/ref=nosim/cafeaulaitA/