I'd say an HTML5 output mode *ought* to work like this:

*Don't try to be clever.*
* Consistency and predictability are key to both security review and data consumability.

*Quote attributes consistently and predictably.*
* Always use double-quotes on attributes in output.

*Output specced empty tags in HTML style.*
* <img>, <hr>, <br> are fine and not ambiguous at all to an HTML parser. There's no need to go adding a "/" in at the end!
* These are already whitelisted in the Html class so it's easy to not mess this up.

*Don't do other silly things for old-school XHTML 1.*
* CDATA wrapping of <script>s and <style>s is not needed.

The only benefit of $wgWellFormedXml was that you could toss your "well-formed" tag soup into an XML parser that didn't grok HTML. I have no idea if that worked reliably or was actually useful to anyone, but it's probably worth confirming that before actually removing the funky self-closing tags.

-- brion

On Mon, May 2, 2016 at 11:42 AM, Brian Wolff <[email protected]> wrote:

> So currently, we have two ways of outputting html. $wgWellFormedXml =
> true (the default) outputs html that happens to conform with the
> rules of XML. $wgWellFormedXml = false, on the other hand, uses more
> lax html5 rules to save a few bytes.
>
> Having two modes of output feels rather silly to me. Originally I
> think this was meant as a feature flag while $wgWellFormedXml=false
> stabilized, but it never got turned on, and here we are 7 years later.
>
> Having $wgWellFormedXml=false increases the complexity of the code,
> and not all that many people use it (a notable exception is
> translatewiki). I think it's important that security-critical code be
> as simple as possible. Furthermore, there seems to be very little
> benefit to having the second mode (after you account for gzip, saving
> a few bytes from writing <img> instead of <img/> really doesn't
> matter, imo).
>
> With that in mind, I would like to propose killing $wgWellFormedXml =
> false. I'm not so much attached to the true mode (although I do feel
> the true mode is significantly more sane), as I just simply want there
> to be a single mode.
> Putting the default to false was vetoed in T52040, so I think that
> true would be the best choice going forward if we are getting rid of
> one of the modes.
>
> If there are aspects of the other mode that people really want, then I
> think we should simply merge that into the default behavior instead
> of having two separate modes.
>
> See gerrit patch https://gerrit.wikimedia.org/r/286495
> I would appreciate everyone's feedback.
>
> Thanks,
> Brian
>
> _______________________________________________
> Wikitech-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitech-l
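
For illustration, the output rules listed at the top of this message (double-quoted attributes, empty tags without a trailing "/", no CDATA wrapping) could be sketched roughly as below. This is a Python sketch and not MediaWiki's Html class: render_element and VOID_ELEMENTS are hypothetical names, the void-element list is my reading of the HTML spec, and contents is assumed to be already safe for its context.

```python
from html import escape

# Assumption: void elements as listed in the HTML spec. They take no
# closing tag, so no trailing "/" is needed.
VOID_ELEMENTS = frozenset({
    "area", "base", "br", "col", "embed", "hr", "img", "input",
    "link", "meta", "source", "track", "wbr",
})


def render_element(tag, attrs=None, contents=""):
    """Serialize one element in a single, predictable HTML style.

    Hypothetical helper for illustration only.
    """
    parts = [tag]
    for name, value in (attrs or {}).items():
        # Always double-quote, and escape &, <, >, " and ' in the value.
        parts.append(f'{name}="{escape(str(value), quote=True)}"')
    open_tag = "<" + " ".join(parts) + ">"

    if tag in VOID_ELEMENTS:
        # HTML style: <br>, never <br/> or <br></br>.
        return open_tag

    # contents is emitted as-is: no CDATA wrapper for <script> or <style>.
    # The caller is assumed to have made it safe for this context.
    return f"{open_tag}{contents}</{tag}>"


if __name__ == "__main__":
    print(render_element("br"))
    print(render_element("img", {"src": "a.png", "alt": 'say "hi"'}))
    print(render_element("script", None, "var x = 1 < 2;"))
```

There is exactly one code path per case, which is the point: a reviewer only has to reason about one quoting style and one way of writing an empty tag.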
