I would also recommend against actively trying to emit barely-parsing
output. Any savings after compression should be rather small, and if only
end tags are omitted, the DOM will of course still be the same size after
parsing.

In Parsoid we went to some modest lengths
<https://github.com/wikimedia/parsoid/blob/master/lib/XMLSerializer.js> to
produce polyglot markup <http://www.w3.org/TR/html-polyglot/>, which is
both valid XML and valid HTML5. This has enabled consumers to use either XML
or HTML5 parsers, which has proven very useful in practice. For example, it
makes the content easier to consume with PHP's libxml. Doing the same
in MediaWiki core is admittedly harder, but I still think that we should
follow the robustness principle
<https://en.wikipedia.org/wiki/Robustness_principle> wherever we can.
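
To illustrate the point about consumers, here is a rough sketch in Python
(standard library only). It is not how Parsoid or MediaWiki does anything. The
two markup strings and the helper names are made up for the example, and the
"polyglot" string is only a simplified fragment (no xmlns declaration). It
shows that XML-style serialization is readable by both kinds of parser, while
terse tag soup is only readable by the HTML one.

```python
import xml.etree.ElementTree as ET
from html.parser import HTMLParser

# Hypothetical sample fragments, invented for this sketch.
POLYGLOT = '<p id="intro">Hello<br/>world <img src="a.png" alt=""/></p>'
TAG_SOUP = '<p id=intro>Hello<br>world <img src=a.png alt="">'


class TagCollector(HTMLParser):
    """Records the name of every start tag the HTML tokenizer sees."""

    def __init__(self):
        super().__init__()
        self.tags = []

    def handle_starttag(self, tag, attrs):
        # Self-closing tags such as <br/> also end up here, because the
        # default handle_startendtag() delegates to handle_starttag().
        self.tags.append(tag)


def html_tags(markup):
    """Tag names as seen by a lenient HTML tokenizer."""
    collector = TagCollector()
    collector.feed(markup)
    collector.close()
    return collector.tags


def xml_tags(markup):
    """Tag names as seen by a strict XML parser, or None if it rejects the input."""
    try:
        return [element.tag for element in ET.fromstring(markup).iter()]
    except ET.ParseError:
        return None


if __name__ == "__main__":
    for name, markup in (("polyglot", POLYGLOT), ("tag soup", TAG_SOUP)):
        print(name, "| html:", html_tags(markup), "| xml:", xml_tags(markup))
```

The well-formed fragment yields the same tag list from both parsers, whereas
the XML parser rejects the tag-soup variant outright.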

Gabriel

On Wed, Feb 18, 2015 at 5:59 PM, Tim Starling <tstarl...@wikimedia.org>
wrote:

> On 19/02/15 08:43, Gergo Tisza wrote:
> > On Wed, Feb 18, 2015 at 1:38 PM, Petr Bena <benap...@gmail.com> wrote:
> >
> >> (Perhaps wgWellFormedXml is true by default?)
> >
> >
> > It is: https://www.mediawiki.org/wiki/Manual:$wgWellFormedXml
>
> There was a Bugzilla report and Gerrit change requesting that it be
> set to false:
>
> https://phabricator.wikimedia.org/T52040
> https://gerrit.wikimedia.org/r/#/c/70036/
>
> I was against it, partly because of the omitted <head> tag.
>
> -- Tim Starling
>
>
> _______________________________________________
> Wikitech-l mailing list
> Wikitech-l@lists.wikimedia.org
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
>