https://bugzilla.wikimedia.org/show_bug.cgi?id=17486
--- Comment #31 from Daniel Friesen <[email protected]> ---
(In reply to comment #30)
> My comment was on topic simply because the malformed output is caused by
> incorrect specification about how distinct content elements can be safely
> embedded into each other.
>
> And the whole topic is about this issue: the basic wiki syntax interacts very
> badly with the HTML (or XML) syntax based on *explicit* closure of tags (or
> wiki syntaxes). The current parsing rules contradict between each other, and
> we constantly have to find tricks to avoid these issues and incorrect output
> (which may parse as valid HTML5 but was in fact not the one intended and will
> be wrong XHTML5 anyway).

Specification and the mixing of custom WikiText syntaxes with HTML are
irrelevant. We're supposed to fail silently when bad WikiText is used and output
valid HTML even when given crap, not output malformed markup.

This WikiText:

* List item 1.
<table class="wikitable">
<tr>
<td>
Cell 1.
</td>
</tr>
</table>
* List item 2.

Outputs this:

<ul><li> List item 1.
<table class="wikitable">
</li></ul>
<tr>
<td>
Cell 1.
</td>
</tr>
</table>
<ul><li> List item 2.
</li></ul>

There's a </li></ul> right after the <table class="wikitable">, which leaves the
<tr> and <td> elements outside of a table. That's invalid.

This bug has nothing to do with integrating the WikiText list syntax and HTML
table markup. The fix for this issue is simply making sure that the garbage we
output for this invalid input is still well-formed markup.

Try inserting that garbage output back into a wiki page:

<ul><li> List item 1.
<table class="wikitable">
</li></ul>
<tr>
<td>
Cell 1.
</td>
</tr>
</table>
<ul><li> List item 2.
</li></ul>

This is essentially the same garbage that the user gives us. But this time the
parser outputs:

<ul><li> List item 1.
<table class="wikitable">
&lt;/li&gt;&lt;/ul&gt;
<tr>
<td>
Cell 1.
</td>
</tr>
</table>
<ul><li> List item 2.
</li></ul></li>
</ul>

There is a minor validity issue here, in that we have a string of text inside of
a <table> but outside of a cell -- fixing that would probably be a separate bug.
That aside, the markup is still well-formed XML. Tags are properly paired up,
with the same number of each, and they are closed in the correct order. When
output into an XHTML5 page parsed with an XML parser this will work and won't
give you an XML parse error.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l
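
The well-formedness criterion used in the comment above (tags paired up, and closed in the correct order) can be illustrated with a small stack-based check. This is only a rough sketch, not anything MediaWiki itself runs: the class and function names are hypothetical, it uses Python's standard html.parser module, and it treats every start tag as needing an explicit close, so void elements such as <br> would be reported as unclosed. It also assumes that the stray </li></ul> in the second output is escaped text rather than real tags, which is what the comment's remark about "a string of text inside of a <table>" implies.

```python
from html.parser import HTMLParser


class TagBalanceChecker(HTMLParser):
    """Hypothetical helper: tracks open tags on a stack, records the first nesting error."""

    def __init__(self):
        super().__init__()
        self.stack = []    # currently open tags, innermost last
        self.error = None  # description of the first mismatch, if any

    def handle_starttag(self, tag, attrs):
        self.stack.append(tag)

    def handle_endtag(self, tag):
        if self.error is not None:
            return  # keep only the first error
        if not self.stack:
            self.error = f"</{tag}> closes nothing"
        elif self.stack[-1] != tag:
            self.error = f"</{tag}> found while <{self.stack[-1]}> is still open"
        else:
            self.stack.pop()


def is_well_nested(markup):
    """Return True if every tag is closed, in the reverse order of opening."""
    checker = TagBalanceChecker()
    checker.feed(markup)
    checker.close()
    return checker.error is None and not checker.stack


# First output from the comment: </li> arrives while <table> is still open.
first_output = (
    '<ul><li> List item 1. <table class="wikitable"> </li></ul> '
    "<tr> <td> Cell 1. </td> </tr> </table> "
    "<ul><li> List item 2. </li></ul>"
)

# Second output, assuming the stray closing tags are escaped text inside the table.
second_output = (
    '<ul><li> List item 1. <table class="wikitable"> &lt;/li&gt;&lt;/ul&gt; '
    "<tr> <td> Cell 1. </td> </tr> </table> "
    "<ul><li> List item 2. </li></ul></li> </ul>"
)

print(is_well_nested(first_output))   # False
print(is_well_nested(second_output))  # True
```

A check like this only covers tag pairing and ordering. It says nothing about validity, such as the text sitting directly inside the <table> in the second output, which is the separate issue the comment sets aside.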
