Hi Daniel,
2010-08-18 09:24, Daniel Kinzler wrote:
> Andreas Jonsson wrote:
>
>> Mixing HTML elements with wikitext is a grey area. How the HTML tags
>> in the wikitext interact with the wikitext elements does not seem very
>> well defined. Therefore, I will make up some rules
>> where I try to preserve any legitimate use of html elements, but with
>> some restrictions to avoid some problems:
>>
>> 1. Do not allow html block elements inside wikitext lists. For example
>> this is no longer allowed:
>>
>> * item1<li> item2
>>
> What does "not allowed" mean, exactly? What happens if the user enters this?
> As by the old mantra, any text is valid wikitext.
>
>
I mean that the character sequence "<li>" will not be a token in this
context. (It will become three tokens: SPECIAL[<], WORD[li] and
SPECIAL[>], which should eventually be rendered as <li> by a html
rendering client.) Of course, the lexical scanner will accept any
sequence of characters.
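
To illustrate, here is a minimal Python sketch of a context-dependent token. It is not the actual lexer: the token name HTML_BLOCK_OPEN, the tag list, and the in_wikitext_list flag are assumptions made up for this example.

```python
import re

# Hypothetical, incomplete list of block tags; only for illustration.
BLOCK_TAG = re.compile(r"<(li|ul|ol|div|table)\b[^>]*>", re.IGNORECASE)
# Fallback: whitespace run, word, or a single special character.
FALLBACK = re.compile(r"\s+|\w+|.", re.DOTALL)

def tokenize(text, in_wikitext_list):
    """Return (kind, value) pairs; block tags are tokens only outside lists."""
    tokens = []
    pos = 0
    while pos < len(text):
        if not in_wikitext_list:
            m = BLOCK_TAG.match(text, pos)
            if m:
                tokens.append(("HTML_BLOCK_OPEN", m.group(1).lower()))
                pos = m.end()
                continue
        # The tag token is disabled (or absent): fall back to plain pieces.
        m = FALLBACK.match(text, pos)
        piece = m.group(0)
        if piece.isspace():
            kind = "SPACE"
        elif piece[0].isalnum() or piece[0] == "_":
            kind = "WORD"
        else:
            kind = "SPECIAL"
        tokens.append((kind, piece))
        pos = m.end()
    return tokens
```

With in_wikitext_list set, "<li>" comes out as SPECIAL[<], WORD[li], SPECIAL[>]; without it, the same characters form a single tag token. No input is ever rejected.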
> So, I think it would make more sense to say that html block elements
> *terminate* wikitext lists.
>
>
That would be a reasonable alternative. But I think that it is better
to disable html block element tokens, because I don't think that it is a
useful feature to make it possible to terminate a list item with
anything but a newline or end of file. I think that it would just be more
confusing for the users.
>> 2. Do not allow table html tags inside wikitext tables, unless opened
>> up by a nested html table, which disables wikitext table tokens
>> until the html table is properly closed:
>>
> ....
>
>> So, we'll get two different kinds of table contexts, which may be
>> arbitrarily nested, but not mixed.
>>
> As long as arbitrary nesting is supported, I'm all for it! Mixing html and
> wiki syntax for table elements leads to a mess with the current parser anyway.
>
>
>> * <img [attributes]> Same as <br> except that it is enabled/disabled
>> via a configuration option.
>>
> Additional restrictions may be imposed on any attribute that contains a URL.
>
>
At the moment I'm working on the lexer, which will attach a list of
whatever looks like attributes to the corresponding token. Filtering
the attribute list will be performed at a higher level. As I understand
it, the attribute list will never affect whether an opening tag should be
treated as a token or not. This is still a br tag:
<br complete *)()()(UF*(*garbage/>
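
A rough Python sketch of that split of responsibilities, under the assumption that the tag name alone decides whether the text is a token. The names scan_br, BR_TAG and ATTR are hypothetical and not taken from the real lexer.

```python
import re

# Hypothetical pattern: only the tag name decides whether this is a token.
BR_TAG = re.compile(r"<br\b([^>]*?)/?>", re.IGNORECASE)
# Collect whatever looks like name or name=value; no validation here.
ATTR = re.compile(r"""([A-Za-z_][\w-]*)(?:\s*=\s*("[^"]*"|'[^']*'|[^\s"'>]+))?""")

def scan_br(text, pos=0):
    """Return ("BR", attribute_list, end_position), or None if no br tag."""
    m = BR_TAG.match(text, pos)
    if m is None:
        return None
    attrs = [
        (name, value.strip("\"'") if value else None)
        for name, value in ATTR.findall(m.group(1))
    ]
    # Filtering of the attribute list is left to a higher level.
    return ("BR", attrs, m.end())
```

Applied to the example line above, scan_br still returns a BR token; the garbage only shows up as odd entries in the attached attribute list, which a later stage is free to drop.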
>> * <p [attributes]> Opening tag enables closing tag </p> and disables
>> itself until the end of the current inlined text. <p> opens up a
>> new paragraph, </p> closes the current inlined text.
>>
> Not sure <p> should be disabled after <p>. Most browsers treat <p>...<p> as
> <p>...</p><p>. That makes more sense, I think.
>
>
That's MediaWiki's current behaviour.
<p> foo <p> foo </p>
is rendered as
<p> foo <p> foo </p>
>> * Inlined html elements. These can be used for long term formatting.
>> The context will make sure they are correctly nested, closed on end
>> of inlined text and reopened at beginning of inlined text. They are
>> permanently closed at the corresponding end tag, or at end of
>> article. Variants:
>>
> Do we really want inline formatting to span across blocks? I find that very
> quirky. I think the format should simply end at the end of the block, that's
> it. Interleaved markup is evil.
>
>
That's MediaWiki's current behaviour.
>> * Block html elements. Start and end tags terminate inline text.
>> (They may _not_ be nested inside paragraphs.)
>>
> That is: they *terminate* paragraphs.
>
>
Inline text is not necessarily contained in a paragraph.
>> Inline text inside
>> <ol> and <ul> implies <li>, inlined text inside <dl> implies <dd>,
>>
> fine
>
>
>> inline text inside <div> implies <p>,
>>
> err. whot? no! <p> usually implies margins/padding. if i use <div>foo</div>, i
> generally do not want any margins/padding!
>
>
Sorry, sometimes I confuse html with DocBook, where all inline text must
stand inside <para> tags.
>> inline text inside <table>
>> implies <tbody><tr><td>, <h1>-<h6> disables wikitext block element
>> tokens, in addition to all html block element tokens except the
>> corresponding closing </hX> token.
>>
> What exactly does "disable" mean here? Do they get stripped? Or displayed
> verbatim?
>
The corresponding tokens are disabled in the lexical scanner.
>> * <pre> disables all html elements and all block elements (both wikitext
>> and html block elements).
>>
> <pre> should disable *all* markup except </pre>. It's actually a lot
> like <nowiki>.
>
> Lines starting with blanks (please include tabs here!), in contrast, become
> pre-formatted, but still allow inline formatting, auto-linking URLs, etc.
>
>
Thanks, I had missed that. I just assumed that they were equivalent.
It seems that block html elements in an indented line take precedence:
Preformatted text? <li> No!
Rendered as:
Preformatted text?
<li> No! </li>
That'll require an extra lookahead on all indented lines. *sigh*
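
Roughly, the lookahead could look like the Python sketch below. This is only an illustration: the tag list is an assumed, incomplete set, and scanning the whole line is a simplification of whatever the real rule turns out to be. Leading tabs are counted as indentation, as Daniel asks.

```python
import re

# Assumed, incomplete set of block-level tag names; only for illustration.
BLOCK_TAGS = ("li", "ul", "ol", "dl", "dd", "dt", "div", "table", "p", "pre", "h[1-6]")
BLOCK_TAG = re.compile(r"</?(?:%s)\b[^>]*>" % "|".join(BLOCK_TAGS), re.IGNORECASE)

def is_preformatted_line(line):
    """True if the line starts with a blank and holds no block html tag."""
    if line[:1] not in (" ", "\t"):
        return False
    # The lookahead: a block tag anywhere on the line wins over indentation.
    return BLOCK_TAG.search(line) is None
```

For the example above, is_preformatted_line(" Preformatted text? <li> No!") is False, while an indented line with only inline markup stays preformatted.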
>> * <ins> and <del> will be inline if occurring inside inlined text.
>> Otherwise block.
>>
>> * <a> disables wikitext link tokens.
>>
> <a> is not allowed at the moment. I once tried to add support for it, but got
> reverted for technical reasons. We might add it to support RDFa (semantic
> relations).
>
You are right, it isn't. Yippihe! :-)
>> * Tag extensions are treated like <nowiki>; the contents are passed
>> verbatim to the corresponding callback function. The parser may be
>> called recursively if the extension needs to parse wikitext.
>>
> Please note that the HTML returned from tag extensions is, at the moment,
> *not* passed verbatim, though it very likely should. See bug 1319, compare
> bug 12974.
>
>
I haven't analysed the tag extensions completely yet. But I assume that
the content isn't touched by the parser, and that if the extension wants
anything inserted into the output stream, it must call the parser
recursively.
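
A toy Python sketch of that assumption follows. It is not MediaWiki's extension interface: parse, render_inline, the callback signature and the tag names raw and box are all invented for the example, and it models the assumed behaviour rather than the current one described in bug 1319.

```python
import re

def render_inline(text):
    # Stand-in for real wikitext processing: ''x'' becomes <i>x</i>.
    return re.sub(r"''(.+?)''", r"<i>\1</i>", text)

def parse(text, extensions):
    """Toy parser: extension content goes untouched to its callback."""
    if not extensions:
        return render_inline(text)
    names = "|".join(re.escape(name) for name in extensions)
    tag = re.compile(r"<(%s)>(.*?)</\1>" % names, re.DOTALL)
    out, pos = [], 0
    for m in tag.finditer(text):
        out.append(render_inline(text[pos:m.start()]))
        callback = extensions[m.group(1)]
        # The callback gets the raw content plus a way to recurse.
        out.append(callback(m.group(2), lambda s: parse(s, extensions)))
        pos = m.end()
    out.append(render_inline(text[pos:]))
    return "".join(out)

def raw_ext(content, recurse):
    # Hypothetical extension that never asks for wikitext parsing.
    return "<code>" + content + "</code>"

def box_ext(content, recurse):
    # Hypothetical extension that calls the parser recursively.
    return "<div>" + recurse(content) + "</div>"
```

Here parse("''a'' <raw>''b''</raw> <box>''c''</box>", {"raw": raw_ext, "box": box_ext}) leaves ''b'' untouched inside the raw tag, and ''c'' is converted only because box_ext asks for it.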
> Thanks for your great work!
> -- daniel
>
> _______________________________________________
> Wikitext-l mailing list
> [email protected]
> https://lists.wikimedia.org/mailman/listinfo/wikitext-l
>
>