Andreas Jonsson wrote:
> ...
> Trying to reproduce this behavior in a new parser would, of course, be
> insane. In fact, the current MediaWiki parser does not seem to parse
> links in linear time using a linear amount of memory. My test server
> failed to process a preview of an article consisting of about 24000
> links of the form [[a]]. It was working hard before it, I
> guess, ran out of memory. As a comparison it parsed over 38000 italic
> a's, ''a'', without problems.
>
> So, what is the reasonable thing to do? First of all it should be
> pointed out that block elements are not allowed inside link text:
>
> http://www.w3.org/TR/2002/REC-xhtml1-20020801/dtds.html#dtdentry_xhtml1-strict.dtd_a
>
> This suggests that any sane wikitext should not allow a link to
> continue past the end of the inlined text where it is located. Even
> better is to say that the sequence [[Link| always opens up a new link
> and that 'end of inline text' will implicitly close the link if it is
> still open. That will not require any lookahead to parse. It would
> be consistent with the format parsing to only allow it to run to the
> end of line, though. Also, currently paragraphs and list elements
> aren't rendered inside link text, unless enclosed or preceded by a
> table. So, unless tables inside link text are a widely used feature,
> such a change might not break that many pages.
>
> /Andreas
>

Keep in mind that MediaWiki is switching to HTML5. Browsers don't even parse according to XHTML rules, and the XHTML doctype is nothing more than a hint to validators, which aren't essential (and not every page validates properly anyway). So I don't believe XHTML rules, with the exception of producing valid XML output, still apply where HTML5 retracts them (HTML5 attempts to define HTML parsing as it should be, based on how it already is, iirc). In this case, HTML5 gives <a> a "transparent" content model: block elements are valid inside an <a> if they would be valid without the <a> there.
So as long as you don't output the wrapping <p>, which you would skip anyway if you got the <div> directly, <a ...><div>...</div></a> is valid.
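
To make the "transparent" rule concrete, here is a rough Python sketch. The category tables are a tiny hand-picked subset for illustration only, not the spec's full tables, and the names (CONTENT_MODEL, allowed_children, is_valid_child) are made up for this example; nothing here is MediaWiki code.

```python
# Hypothetical, heavily simplified content categories (not the full HTML5 tables).
PHRASING = {"span", "em", "a", "#text"}
FLOW = PHRASING | {"div", "p", "table", "ul"}

# What each non-transparent parent accepts in this sketch.
CONTENT_MODEL = {
    "body": FLOW,
    "div": FLOW,
    "p": PHRASING,
    "span": PHRASING,
}

# Elements whose content model is "transparent": they accept whatever
# their own parent would accept.
TRANSPARENT = {"a"}


def allowed_children(ancestors):
    """Return the tags allowed as children, given ancestors (outermost first)."""
    for tag in reversed(ancestors):
        if tag not in TRANSPARENT:
            # Unknown parents accept nothing in this sketch.
            return CONTENT_MODEL.get(tag, frozenset())
    return FLOW  # assumption: only transparent ancestors, treat as flow content


def is_valid_child(child, ancestors):
    """Check whether `child` may appear directly under the last ancestor."""
    if child == "a" and "a" in ancestors:
        return False  # links may not be nested inside links
    return child in allowed_children(ancestors)


if __name__ == "__main__":
    print(is_valid_child("div", ["body", "a"]))       # <a> directly in flow content
    print(is_valid_child("div", ["body", "p", "a"]))  # <a> wrapped in a <p>
```

The second call is the case to watch for: once the <a> sits inside a <p>, the <div> is no more valid inside the link than it would be directly inside the <p>.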
Just making note...

--
~Daniel Friesen (Dantman, Nadir-Seen-Fire) [http://daniel.friesen.name]
