2010-08-28 14:32, Daniel Friesen wrote:
> Andreas Jonsson wrote:
>    
>> ...
>> Trying to reproduce this behavior in a new parser would, of course, be
>> insane.  In fact, the current MediaWiki parser does not seem to parse
>> links in linear time using a linear amount of memory.  My test server
>> failed to process a preview of an article consisting of about 24000
>> links of the form [[a]]. It was working hard before it, I
>> guess, ran out of memory.  As a comparison it parsed over 38000 italic
>> a's, ''a'', without problems.
>>
>> So, what is the reasonable thing to do?  First of all it should be
>> pointed out that block elements are not allowed inside link text:
>>
>> http://www.w3.org/TR/2002/REC-xhtml1-20020801/dtds.html#dtdentry_xhtml1-strict.dtd_a
>>
>> This suggests that any sane wikitext should not allow a link to
>> continue past the end of the inlined text where it is located.  Even
>> better is to say that the sequence [[Link| always opens up a new link
>> and that 'end of inline text' will implicitly close the link if it is
>> still open.  That will not require any lookahead to parse.  It would
>> be consistent with the format parsing to only allow it to run to the
>> end of line, though.  Also, currently paragraphs and list elements
>> aren't rendered inside link text, unless enclosed or preceded by a
>> table.  So, unless tables inside link text are a widely used feature,
>> such a change might not break that many pages.
>>
>> /Andreas
>>
>>      
> Keep in mind that MediaWiki is switching to html5. As the browsers don't
> even parse according to xhtml rules, and the xhtml doctype means nothing
> but a hint to validators (which not every page even validates properly
> anyways) which aren't essential, I don't believe xhtml rules -- with the
> exception of valid xml output -- are valid if they are retracted by
> html5 (which attempts to define html parsing how it should be, based on
> how it already is, iirc).
> In this case, html5 defines <a> as "transparent content"; block elements
> are valid inside of an <a> if they are valid without the <a> there. So
> as long as you don't output the <p>, as you would do anyways if you got
> the <div> directly, then <a ...><div>...</div></a> is valid.
>
> Just making note...
>
>    
That's very interesting.   I didn't know that.
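
For reference, the no-lookahead rule quoted above ("[[" always opens a
new link, and end of inline text implicitly closes a link that is still
open) can be sketched roughly as below. This is only an illustration
under assumed rules, not MediaWiki's parser: the function name
parse_inline, the token tuples, and the choice to treat one line as the
unit of inline text are all hypothetical.

```python
def parse_inline(line):
    """Tokenize one line of inline text in a single left-to-right pass.

    Assumed (hypothetical) rules: '[[' always opens a new link, ']]'
    closes the open link, and end of line implicitly closes a link that
    is still open. Returns ('text', s) and ('link', target, label) tuples.
    """
    tokens = []
    buf = []          # characters of the current text run or link body
    in_link = False
    i = 0

    def flush():
        # Emit whatever has been collected so far as a link or a text run.
        nonlocal buf
        body = "".join(buf)
        buf = []
        if in_link:
            target, _, label = body.partition("|")
            tokens.append(("link", target, label or target))
        elif body:
            tokens.append(("text", body))

    while i < len(line):
        two = line[i:i + 2]
        if two == "[[":
            flush()           # emits pending text, or closes an open link
            in_link = True
            i += 2
        elif two == "]]" and in_link:
            flush()
            in_link = False
            i += 2
        else:
            buf.append(line[i])
            i += 1

    flush()                   # end of inline text closes an open link
    return tokens
```

Each character is examined once and never revisited, so the pass is
linear in the input length; an unterminated "[[Link|text" simply
becomes a link that ends at the end of the line.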

/Andreas


_______________________________________________
Wikitext-l mailing list
https://lists.wikimedia.org/mailman/listinfo/wikitext-l