On 2/13/08, Daniel Kinzler <[EMAIL PROTECTED]> wrote:
> No. If a tag-style extension wants to support wiki text, it has to explicitly
> invoke a new parser pass on the text contained between the tags. The text MUST
> NOT be parsed/transformed before being passed to the extension, and what the
> extension returns must not be parsed either (the latter is only partially true
> for the current parser, but i would call that a bug, not a feature - see bug
> 8997).
So, the parse sequence for:
* <ref> '''blah'''</ref>
basically goes:
1. Parse bullet and find <ref>...</ref>
2. Pass <ref> chunk to extension.
3. Extension processes <ref> chunk, calls parser to process the bold
tags, returns something with <b>blah</b>
4. Parser continues on...
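To make that sequence concrete, here is a rough Python sketch of it. This is not MediaWiki's actual code: parse_inline, ref_extension, TAG_HOOKS and the <sup> wrapper are all made-up names for illustration, and the placeholder trick is just one assumed way of making sure the extension's output is never parsed again.

```python
import re

def parse_inline(text):
    """Toy inline pass: only understands '''bold'''."""
    return re.sub(r"'''(.+?)'''", r"<b>\1</b>", text)

def ref_extension(raw, parser):
    """Hypothetical <ref> handler: gets the raw text and opts in to parsing it."""
    return "<sup>" + parser(raw.strip()) + "</sup>"

# Hypothetical registry of tag-style extensions.
TAG_HOOKS = {"ref": ref_extension}
TAG_RE = re.compile(r"<(%s)>(.*?)</\1>" % "|".join(TAG_HOOKS), re.S)

def parse_line(line):
    # Step 1: parse the bullet and find the tag chunks before any inline parsing.
    is_bullet = line.startswith("*")
    body = line[1:].strip() if is_bullet else line
    outputs = []

    def stash(match):
        tag, raw = match.group(1), match.group(2)
        # Steps 2-3: hand the untouched chunk to the extension, which may
        # call back into the parser itself.
        outputs.append(TAG_HOOKS[tag](raw, parse_inline))
        return "\x00%d\x00" % (len(outputs) - 1)

    body = TAG_RE.sub(stash, body)
    # Step 4: the parser continues on the rest of the line; the extension
    # output sits behind placeholders, so it is not parsed a second time.
    body = parse_inline(body)
    body = re.sub(r"\x00(\d+)\x00", lambda m: outputs[int(m.group(1))], body)
    return "<li>" + body + "</li>" if is_bullet else body

print(parse_line("* <ref> '''blah'''</ref>"))  # <li><sup><b>blah</b></sup></li>
```

An extension that never calls the parser would get its ''' markers back untouched, which is the behaviour Daniel describes.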
> Magic words don't have to have the form __XXX__ - they can be characterized by
> any regular expression. Consider how ISBN and RFC are treated - those are magic
> words too... Oh and please consider that the patterns are frequently localizable
No, they're not: the keywords (ISBN, RFC, PMID) are hardcoded into the
parser code and are not internationalisable. I call them "magic links"
in my grammar.
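As a rough illustration of what I mean by a magic link, here is a small Python sketch. The three keywords are the hardcoded ones; the number pattern is a simplifying assumption of mine and not what the parser actually accepts, and MAGIC_LINK_RE and find_magic_links are made-up names.

```python
import re

# Keywords are fixed and case-sensitive. The number part is deliberately
# loose (digits, hyphens, X) and is an assumption for this sketch only.
MAGIC_LINK_RE = re.compile(r"\b(RFC|PMID|ISBN) +([0-9][0-9Xx-]*)")

def find_magic_links(text):
    """Return (keyword, number) pairs for each magic link found in text."""
    return [(m.group(1), m.group(2)) for m in MAGIC_LINK_RE.finditer(text)]

print(find_magic_links("See RFC 2822 and ISBN 0-306-40615-2, not rfc 1."))
# [('RFC', '2822'), ('ISBN', '0-306-40615-2')]
```

Since the keywords are literals in the pattern, there is nothing here for a messages file to override.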
> (and are thus maintained in mediawiki's messages files): French, for example,
> allows __AUCUNETABLE__ for __NOTOC__. The same goes for #REDIRECT btw: dutch
> allows #DOORVERWIJZING, etc...
That's OK, though I'd forgotten that #REDIRECT is a magic word.
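Localised magic words like these could be handled with a per-language alias table. A minimal Python sketch, assuming the aliases have already been read out of the messages files into a dict: ALIASES and build_matcher are hypothetical names, the table holds only the two examples from this thread, and I am assuming the English forms stay valid alongside the localised ones.

```python
import re

# Hypothetical alias table: language -> magic word id -> accepted spellings.
ALIASES = {
    "fr": {"notoc": ["__AUCUNETABLE__", "__NOTOC__"]},
    "nl": {"redirect": ["#DOORVERWIJZING", "#REDIRECT"]},
}

def build_matcher(lang):
    """Return a function listing the magic word ids found in a text."""
    table = ALIASES.get(lang, {})
    lookup = {alias: word_id
              for word_id, aliases in table.items()
              for alias in aliases}
    if not lookup:
        return lambda text: []
    # Longest alias first, so a short alias never shadows a longer one.
    ordered = sorted(lookup, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(alias) for alias in ordered))
    return lambda text: [lookup[m.group(0)] for m in pattern.finditer(text)]

print(build_matcher("fr")("Intro __AUCUNETABLE__ text"))  # ['notoc']
```

The point is that the grammar only needs to know the fixed shape of each alias; the actual spellings come from the table.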
> I'm not entirely sure if extensions are free to define magic words using *any*
> pattern, but I think this is so. MagicWord.php is entirely regex-based. Which
> would mean that either your parser will only support some types of magic words,
> or it needs a way to hook into the actual grammar.
Yes, as I discussed, there will need to be restrictions on the form of
magic words, which is not a bad thing anyway.
> Oh, and "variables" like {{PAGENAME}} are treated as magic words internally,
> though that wouldn't have to be so. I would probably use the template mechanism,
> and simply intercept the use of special names.
I'm a bit unclear on the meaning and current processing of the things
involving curly braces. Can someone help me out here:
* {{template}} - totally handled by preprocessor?
* {{{1}}} - template parameter, totally handled by preprocessor?
* {{PAGENAME}} - "magic" variable? Where is it handled? Does it have to be caps?
* {{foo:blah}} - parser function? Where is it handled?
* {{defaultsort:blah}} - same question
Any others?
Currently I'm handling these:
* __TOC__ etc (magic words)
* #REDIRECT
* ISBN, PMID, RFC (magic links)
Steve
_______________________________________________
Wikitext-l mailing list
https://lists.wikimedia.org/mailman/listinfo/wikitext-l