Tim Starling wrote:
>The HTML 5 spec makes a distinction between valid, balanced HTML and
>error recovery algorithms. Browsers are basically the only clients
>able to handle moderately bad HTML, and as I've previously said in
>discussions of HTML 5 output, I don't think it is acceptable to screw
>over all non-browser clients by sending output that relies on obscure
>details of the HTML 5 spec. I think XHTML or something close to it is
>an appropriate machine-readable output format.

Machine-readable output format? Are you suggesting a change from the
current policy of telling everyone who screen-scrapes HTML to stop and
use api.php instead? Otherwise, given that the majority of our traffic
comes from actual browsers, as I understand it, I'm not sure which
clients you're trying to serve.

>And, as several others have noted, you can't just disable Tidy, since
>the effects of unclosed tags are not confined to the content area, and
>there is a large amount of existing content that depends on it. I have
>seen the effects of Tidy being accidentally disabled on the English
>Wikipedia, it is not pleasant.
>
>Am I correct in saying that MZMcBride is the only person in this
>thread in favour of the idea of getting rid of HTML cleanup?

I think it depends on what you mean by "HTML cleanup." Are you referring only
to "fixing" mismatched HTML elements or are you also referring to
reimplementing all of the other behavior that Tidy brings in?

Bartosz wrote:
>We really do need this feature. Not anything else that Tidy does, most
>of its behavior is actually damaging, but we need to match the open and
>close tags to prevent the interface from getting jumbled.

My reading of this thread is that this is the consensus view. The problem,
as I see it, is that Tidy has been deployed long enough that some users
are also relying on all of its other bad behaviors. It seems to me that a
replacement for Tidy either has to reimplement all of its unwanted
behaviors to avoid breaking current wikitext or has to break an unknown
amount of current wikitext.

MZMcBride


