On 02/05/12 03:24, Victor Vasiliev wrote:
> I am a bit leery though about the part where you suggest that
> name-value arguments ({{#invoke:module|func|param=value}}) should be
> parsed by engine, not the script. Don't you have to expand those
> arguments in order to parse them, hence making any form of
> lazy-expanding impossible?
No, you don't have to expand the arguments in order to extract equals
signs for name/value pairs. The equals signs are already identified by
the preprocessor's parser, for the purposes of lazy expansion of
template arguments. See PPFrame::newChild() and the implementation of
the #switch parser function.
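A minimal sketch of the idea, not the MediaWiki implementation: once the arguments have been split at the equals sign, the names are known while the values can stay unexpanded until a script reads them. Everything here is a hypothetical stand-in (`expand`, `makeLazyArgs`, and the use of plain strings as "nodes"), and names are treated as plain strings for simplicity.

```lua
-- Hypothetical stand-ins: a "node" is an unexpanded chunk of wikitext, and
-- expand() is a placeholder for whatever the real preprocessor would do.
local expansions = 0

local function expand( node )
    expansions = expansions + 1
    return string.upper( node )  -- pretend expansion
end

-- The arguments arrive already split at the top-level equals sign, so the
-- names are available while the values are still unexpanded nodes.
local function makeLazyArgs( parts )
    local cache = {}
    return setmetatable( {}, {
        __index = function ( _, name )
            local node = parts[name]
            if node == nil then
                return nil
            end
            if cache[name] == nil then
                cache[name] = expand( node )  -- expanded on first access only
            end
            return cache[name]
        end
    } )
end

local args = makeLazyArgs{ param = 'value', unused = '{{expensive template}}' }
local first = args.param
local second = args.param
```

In this sketch `unused` is never expanded, and `param` is expanded once even though it is read twice.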
[...]
> This is the part which I strongly oppose. Providing direct
> preprocessor access to Lua scripts is a bad idea. There are two key
> reasons for this:
> 1. Preprocessor is slow.
We can limit the input size, or temporarily reduce the general parser
limits like post-expand include size and node count. We can also hook
into PPFrame::expand() to periodically check for a Lua timeout, if
that is necessary.
The preprocessor is slow now; allowing Lua to call it won't make it
any slower.
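As a rough illustration of the two mitigations (an input size cap and a periodic timeout check during expansion), here is a sketch in plain Lua. It does not use any real MediaWiki hook; the function name, the limits and the `timedOut` callback are all assumptions made up for the example.

```lua
-- All names and numbers below are hypothetical, for illustration only.
local MAX_INPUT_BYTES = 1000   -- cap on the wikitext a script may pass in
local CHECK_INTERVAL = 100     -- look at the clock once per this many nodes

-- nodes: list of already-parsed chunks; timedOut: function returning true
-- once the Lua time limit has passed.
local function expandWithLimits( nodes, inputBytes, timedOut )
    if inputBytes > MAX_INPUT_BYTES then
        error( 'input too large', 0 )
    end
    local out = {}
    for i, node in ipairs( nodes ) do
        if i % CHECK_INTERVAL == 0 and timedOut() then
            error( 'Lua time limit exceeded', 0 )
        end
        out[#out + 1] = node   -- stand-in for real expansion work
    end
    return table.concat( out )
end

local never = function () return false end
local always = function () return true end

local small = expandWithLimits( { 'a', 'b', 'c' }, 3, never )
local okBig, errBig = pcall( expandWithLimits, {}, 5000, never )

local many = {}
for i = 1, 250 do many[i] = 'x' end
local okSlow, errSlow = pcall( expandWithLimits, many, 250, always )
```

Checking only every `CHECK_INTERVAL` nodes keeps the cost of the check small relative to the expansion work itself.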
> 2. You would have to work out many very subtle issues with time out
> and nested Lua scripts. This includes timeout subtleties caused by the
> preprocessor slowness (load a slow template, and given the small Lua
> time limit, it will cause PHP to show a fatal error due to emergency
> timeout; even if you fix it, the standalone version uses ulimit, and
> it may be more difficult to fix).
The scenario you give in brackets will not happen. If a Lua timeout
occurs when the parser is executing, the Lua script will terminate
when the parser returns control to it. The timeout is not missed.
It doesn't matter if there are several levels of parser/Lua recursion
when a timeout occurs. LuaSandbox is able to unwind the stack efficiently.
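The behaviour described above can be modelled in plain Lua. This is only a simulation under stated assumptions, not LuaSandbox code: `timedOut` stands for a flag the host would set from a timer, and `callParser` is a hypothetical wrapper around a call into the parser.

```lua
local timedOut = false  -- would be set asynchronously by a timer in the host

-- Stand-in for a call into the parser. If the timer fires while the parser
-- is running, nothing is interrupted; the flag is simply recorded.
local function callParser( work )
    local result = work()
    if timedOut then
        -- Control has returned to Lua: terminate the script now.
        error( 'Lua time limit exceeded', 0 )
    end
    return result
end

local log = {}

-- Several levels of Lua -> parser -> Lua -> parser recursion.
local function level( n )
    return callParser( function ()
        if n == 0 then
            timedOut = true  -- simulate the timer firing deep inside the parser
            return 'leaf'
        end
        local inner = level( n - 1 )
        log[#log + 1] = 'level ' .. n .. ' continued'  -- skipped after a timeout
        return inner
    end )
end

local ok, err = pcall( level, 3 )
```

The error raised at the innermost level propagates through every enclosing frame to the outermost `pcall`, so no intermediate level resumes its work.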
The emergency timeout mechanism is functionally equivalent to PHP's
request timeout, so the emergency timeout can probably just be
infinite, and we can rely on the request timeout to terminate
long-running parse requests, as we do now. We could have a Lua script
time limit of a few seconds, and a request timeout of 3 minutes.
> Now, let me go through your suggested use cases and propose some alternatives:
>
> 1. As an alternative to a string literal, to include snippets of
> wikitext which are intended to be editable by people who don't know
> Lua.
> I think it would be in fact better if you provided an interface for
> getting unprocessed wikitext. Or a preprocessor DOM. Preprocessed text
> makes it difficult to combine human-readable and machine-readable
> versions.
Maybe you are thinking of some sort of virtual wikidata system
involving extracting little snippets of text from infobox invocations
or something. I am not. I would rather use the real wikidata for that.
I am talking about including large, wikitext-formatted chunks of
content language.
> 2. During migration, to call complex metatemplates which have not yet
> been ported to Lua, or to test migrated components independently
> instead of migrating all at once.
> That would eventually lead them to becoming permanent. Bugzilla quips,
> an authoritative reference on Wikimedia practices, says that
> "temporary solutions have a terrible habit of becoming permanent,
> around here". Hence I would suggest that we avoid the temptation in
> the first place.
I don't think it's morally wrong to provide a migration tool.
Migration will be a huge task, and will continue for years. People who
migrate metatemplates to Lua will need lots of tools.
> 3. To provide access to miscellaneous parser functions and variables.
> Now, this is a really bad idea. It is like making a scary hack an
> official way to do things. It actually defies the first design
> principle you state. preprocess( "{{FULLPAGENAME}}" ) is not only much
> uglier than using an appropriate API like mw.page.name(), it is also
> one of the slowest ways to do this. I have benchmarked it, and it is
> actually ~450 times slower than accessing the title object directly.
> Lua was (and is) meant to improve the readability of templates, not to
> clutter them with stuff like articlesNum = tonumber( preprocess(
> "{{NUMBEROFARTICLES:R}}" ) ).
> Solution: proper API would do the job (actually I am currently working on it).
We can provide an API for such things at some point in the future. I
am not very keen on just merging whatever interface you are privately
working on, without any public review.
I am publishing my proposed interface before I write the code for it,
so that I can respond to the comments on it without appearing to be
too invested in any given solution. I wish that you would occasionally
do the same. Rewriting code that you've spent many hours on can be
emotionally difficult. Perhaps that's why you've made no more changes
to ustring.c despite the problems with its interface.
> 4. To allow Lua to construct tag invocations, such as <ref> and <gallery>.
> We could make a #tag-like function to do this, just as we do with
> parser functions.
>
> I feel much more comfortable with the original return {expand =
> true} idea, which causes the wikitext to be expanded in the new
> Scribunto call frame.
That would lead to double-expansion in cases where text derived from
input arguments needs to be concatenated with wikitext to be expanded.
Consider:
    return {
        expand = true,
        text = formatHeader( frame.args.gallery_header ) .. '\n' ..
            '<gallery>' .. images .. '</gallery>' }
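To make the double-expansion problem concrete, here is a toy model in plain Lua. The `expand` function and the `templates` table are invented for the illustration and are far simpler than the real preprocessor; the point is only the difference between expanding the whole return value and expanding just the part the module wrote itself.

```lua
-- Toy expander, purely illustrative: replaces {{name}} with a looked-up value.
local templates = { greeting = 'hello', secret = 'EXPANDED TWICE' }

local function expand( text )
    return ( string.gsub( text, '{{(%w+)}}', function ( name )
        return templates[name]  -- nil leaves the match unchanged
    end ) )
end

-- Argument text that has already been expanded once and is meant to stay
-- literal from here on.
local header = 'literal {{secret}}'

-- "return { expand = true, text = ... }" style: the argument text is
-- expanded again along with the module's own wikitext.
local wholeResult = expand( header .. '\n' .. '{{greeting}}' )

-- Explicit call style: only the module's own wikitext is expanded.
local partResult = header .. '\n' .. expand( '{{greeting}}' )
```

In the first form the module has no way to protect `header`; in the second it chooses exactly which text is passed to the expander.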
> I am a bit puzzled about the "always use named arguments scheme" part,
> because it is not how the standard Lua library works.
It gives flexibility for future development. That was not a core
principle driving the design of the standard Lua library.
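A small sketch of what that flexibility looks like in practice. `makeLink` and its options are hypothetical and not part of any proposed interface; the `anchor` option plays the role of a parameter added in a later revision.

```lua
-- Hypothetical function taking a single table of named arguments.
local function makeLink( opts )
    local target = opts.target
    local text = opts.text or target
    local anchor = opts.anchor  -- added in a later revision; optional
    if anchor then
        target = target .. '#' .. anchor
    end
    return '[[' .. target .. '|' .. text .. ']]'
end

-- A caller written before "anchor" existed keeps working unchanged.
local old = makeLink{ target = 'Main Page' }

-- A newer caller can use the extra option, in any order.
local new = makeLink{ target = 'Main Page', anchor = 'News', text = 'news' }
```

With positional arguments, adding `anchor` would have meant either appending it to the end of the list or breaking every existing caller.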
-- Tim Starling
_______________________________________________
Wikitech-l mailing list
https://lists.wikimedia.org/mailman/listinfo/wikitech-l