--- Comment #6 from Philippe Verdy <verd...@wanadoo.fr> 2010-10-10 11:44:48 UTC
Related bug #16778
Seems to be general to various parser functions, like:
whose parameters are not properly stripped of their "nowiki" sections where
they should be, even though these nowiki sections will NEVER be returned:
these parameters are only used by the parser function itself, which computes
and generates something else.
It is more complex to handle for padleft: and padright: (except for their
length parameter), because parts of their parameters are returned and should
preserve some nowiki markup.
Note: Do we really need to generate such long UNIQ ids for <nowiki> sections?
Should they simply be stripped out completely, or just replaced by a single
control character (if they are only there to block whitespace trimming of
template parameters)? After all, the final HTML tidy step will automatically
remove this control character, which is invalid in HTML.
And if the only purpose of the UNIQ id is to prevent trimming in successive
passes, isn't there a better way to represent whitespace that should be
preserved?
* For example, trimmable whitespace would remain as SPACE, TAB or NEWLINE; all
other, untrimmable whitespace would be encoded temporarily as some other C0 or
C1 controls (such as \x0C for untrimmable SPACE, \x8A for untrimmable NEWLINE,
\x89 for untrimmable TAB), and converted back to whitespace just before the
HTML generation step at the end of parsing, using a very fast byte-for-byte
substitution (that could then reuse the same string buffer).
* It would greatly simplify the implementation of padleft: and padright: (when
they parse UTF-8 characters, they just have to skip over a single byte \x06
just like they skip over UTF-8 sequences...)
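
The substitution idea above could be sketched roughly as follows. This is only
an illustration in Python, not MediaWiki code: the function names protect(),
trim() and restore() are hypothetical, and only the three placeholder bytes
come from the proposal. The sketch assumes ASCII-only input, because \x89 and
\x8A can also occur as continuation bytes inside UTF-8 sequences, so a real
implementation would have to deal with that collision.

```python
# Placeholder bytes as proposed above; all names here are hypothetical.
PROTECT = bytes.maketrans(b" \n\t", b"\x0c\x8a\x89")
RESTORE = bytes.maketrans(b"\x0c\x8a\x89", b" \n\t")

# Only real SPACE, TAB and NEWLINE may be trimmed. The set is explicit because
# the default bytes.strip() would also remove \x0c (form feed).
TRIMMABLE = b" \t\n"


def protect(raw: bytes) -> bytes:
    """Encode whitespace from a <nowiki> section as untrimmable placeholders."""
    return raw.translate(PROTECT)


def trim(param: bytes) -> bytes:
    """Trim a template parameter; placeholder bytes are left alone."""
    return param.strip(TRIMMABLE)


def restore(buf: bytes) -> bytes:
    """Byte-for-byte conversion back to whitespace, just before HTML output."""
    return buf.translate(RESTORE)


# A parameter whose middle part came from a <nowiki> section.
param = b" \n" + protect(b"  padded  ") + b"\t "
trimmed = trim(param)      # outer whitespace removed, placeholders kept
final = restore(trimmed)   # b"  padded  "
```

Since each placeholder is exactly one byte, the output has the same length as
the input, which is what would let the final pass reuse the same buffer.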