Nick Kew wrote:
> Stefan Behnel <[EMAIL PROTECTED]> wrote:
>> Nick Kew wrote:
>>> On Mon, 18 Jun 2007 08:14:01 -0400
>>> Try running the following through "xmllint --html":
>>>
>>> <meta http-equiv="content-type" content="text/html;charset=ascii" />
>>> <html lang="en">
>>> <head><title>foo</title></head>
>>> <body><h1>Hello, World</h1></body>
>>> </html>
>> In that case I would actually prefer making it a general special case
>> rule in the current parser to interpret a leading <meta> tag as an
>> encoding hint to the parser. That would add quite a portion of
>> real-world non-HTML to the set of parsable (i.e. fixable) documents.
[...]
> I'm trying to get away from ad-hoc fixes!

I don't consider that an ad-hoc fix. It's just special-casing a specific type
of broken HTML that exists in real life. I wouldn't even mind if the <meta>
tag were discarded; it should just

a) be interpreted as an encoding hint
and
b) not change the remaining 'real' markup.

I think such a rule should go into the mainstream parser.
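
As a rough illustration of rules a) and b), here is a minimal Python sketch.
It is not libxml2 code: the function name sniff_leading_meta and both regular
expressions are hypothetical, and it assumes the <meta> tag is the very first
thing in the input and carries its charset in a content attribute. It takes
the "discard the tag" option and returns the rest of the bytes untouched.

```python
import re

# Hypothetical pattern: a <meta ... content="..."> tag at the very start of
# the input (leading whitespace allowed), i.e. before any <html> element.
_LEADING_META = re.compile(
    rb'^\s*<meta\b[^>]*?\bcontent\s*=\s*(["\'])(?P<content>.*?)\1[^>]*>',
    re.IGNORECASE | re.DOTALL,
)
# Pulls the charset label out of a value such as "text/html;charset=ascii".
_CHARSET = re.compile(rb'charset\s*=\s*([A-Za-z0-9._:-]+)', re.IGNORECASE)


def sniff_leading_meta(data: bytes):
    """Return (encoding_hint, remaining_markup).

    If the input starts with a <meta> tag that names a charset, the charset
    is returned as a lowercase string and the tag is dropped. Otherwise the
    hint is None and the input is returned unchanged.
    """
    match = _LEADING_META.match(data)
    if match is None:
        return None, data
    charset = _CHARSET.search(match.group("content"))
    if charset is None:
        return None, data
    # Everything after the tag is passed through byte for byte (rule b).
    return charset.group(1).decode("ascii").lower(), data[match.end():]
```

The returned hint would then be handed to the real parser as the document
encoding, and the remaining bytes parsed as if the <meta> had never been
there.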

Stefan
_______________________________________________
xml mailing list, project page  http://xmlsoft.org/
[email protected]
http://mail.gnome.org/mailman/listinfo/xml
