2010-08-28 14:32, Daniel Friesen wrote:
> Andreas Jonsson wrote:
>
>> ...
>> Trying to reproduce this behavior in a new parser would, of course, be
>> insane. In fact, the current MediaWiki parser does not seem to parse
>> links in linear time using a linear amount of memory. My test server
>> failed to process a preview of an article consisting of about 24000
>> links of the form [[a]]. It was working hard before it, I
>> guess, ran out of memory. As a comparison it parsed over 38000 italic
>> a's, ''a'', without problems.
>>
>> So, what is the reasonable thing to do? First of all it should be
>> pointed out that block elements are not allowed inside link text:
>>
>> http://www.w3.org/TR/2002/REC-xhtml1-20020801/dtds.html#dtdentry_xhtml1-strict.dtd_a
>>
>> This suggests that any sane wikitext should not allow a link to
>> continue past the end of the inlined text where it is located. Even
>> better is to say that the sequence [[Link| always opens up a new link
>> and that 'end of inline text' will implicitly close the link if it is
>> still open. That will not require any lookahead to parse. It would
>> be consistent with the format parsing to only allow it to run to the
>> end of line, though. Also, currently paragraphs and list elements
>> aren't rendered inside link text, unless enclosed or preceded by a
>> table. So, unless tables inside link text are a widely used feature,
>> such a change might not break that many pages.
>>
>> /Andreas
>>
>>
> Keep in mind that MediaWiki is switching to html5. As the browsers don't
> even parse according to xhtml rules, and the xhtml doctype means nothing
> but a hint to validators (which not every page even validates properly
> anyways) which aren't essential, I don't believe xhtml rules -- with the
> exception of valid xml output -- are valid if they are retracted by
> html5 (which attempts to define html parsing how it should be, based on
> how it already is, iirc).
> In this case, html5 defines <a> as "transparent content": block elements
> are valid inside of an <a> if they are valid without the <a> there. So
> as long as you don't output the <p>, as you would do anyways if you got
> the <div> directly, then <a ...><div>...</div></a> is valid.
>
> Just making note...
>

That's very interesting.  I didn't know that.
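
For illustration, the lookahead-free rule quoted above ("[[" always opens
a new link, and end of inline text implicitly closes a link that is still
open) could be sketched roughly as follows. This is not MediaWiki's parser.
The function name tokenize_inline and the event tuples are made up for the
example, and two details are assumptions the thread does not settle: a
second "[[" inside an open link implicitly closes the first one, and a
"]]" with no open link is kept as literal text.

```python
def tokenize_inline(text):
    """Scan one run of inline text left to right in a single pass.

    Hypothetical sketch: it inspects at most two characters at a time and
    never searches ahead for a matching "]]", so time and memory are
    linear in len(text).
    """
    events = []        # list of (kind, value) tuples
    buf = []           # pending literal characters
    link_open = False
    i = 0

    def flush():
        # Emit any buffered literal text as one event.
        if buf:
            events.append(("text", "".join(buf)))
            buf.clear()

    while i < len(text):
        pair = text[i:i + 2]
        if pair == "[[":
            flush()
            if link_open:
                # Assumption: links do not nest; close the open one first.
                events.append(("link_close", "implicit"))
            events.append(("link_open", ""))
            link_open = True
            i += 2
        elif pair == "]]" and link_open:
            flush()
            events.append(("link_close", "explicit"))
            link_open = False
            i += 2
        else:
            # Ordinary character, or a stray "]]" with no open link.
            buf.append(text[i])
            i += 1

    flush()
    if link_open:
        # End of inline text implicitly closes the link.
        events.append(("link_close", "implicit"))
    return events


if __name__ == "__main__":
    for event in tokenize_inline("see [[Link|some text"):
        print(event)
```

Since the scanner never backtracks, an input like "[[a]]" repeated 24000
times yields three events per link in a single pass.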

/Andreas

_______________________________________________
Wikitext-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikitext-l
