[Bug 35746] {{PAGENAME}} must not escape special chars, otherwise it makes {{#ifeq:}} unusable

bugzilla-daemon Wed, 29 Jan 2014 05:23:21 -0800

https://bugzilla.wikimedia.org/show_bug.cgi?id=35746


Philippe Verdy <[email protected]> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |[email protected]

--- Comment #8 from Philippe Verdy <[email protected]> ---
The safest way to compare page names is to pass them BOTH through
{{PAGENAME|pagename}}, or BOTH through {{PAGENAMEE|pagename}}. If you also want
to compare their namespaces, pass both pagenames as the parameter of
{{FULLPAGENAME|pagename}}, so that the given pagename won't have its namespace
parsed and removed.

Note that these functions will also resolve relative paths in subpages and
FULLPAGENAME(E) will also resolve the namespace.

So:
    {{#ifeq: {{PAGENAME}}|Q & A|true|false}}
will always be false on every page, but the following will work:
    {{#ifeq: {{PAGENAME}}|{{PAGENAME|Q & A}}|true|false}}
as it will return "true" on the expected page.

With full page names where you also check the namespace:
    {{#ifeq: {{FULLPAGENAME}}|{{FULLPAGENAME|Q & A}}|true|false}}
will also return true but only in the main namespace (it will be false on a
Category page named "Category:Q & A", because the second parameter of "#ifeq"
gets the full page name of page "Q & A" in the main namespace).
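
To make the comparison problem concrete, here is a minimal Python sketch. It is
only a model, not MediaWiki code: pagename_html() is a hypothetical stand-in for
{{PAGENAME|...}} that HTML-encodes nothing but the ampersand, and ifeq() is a
rough string-only model of {{#ifeq:}}.

```python
def pagename_html(title: str) -> str:
    """Hypothetical stand-in for {{PAGENAME|title}}: HTML-encode only '&'."""
    return title.replace("&", "&#38;")


def ifeq(left: str, right: str) -> str:
    """Rough string-only model of {{#ifeq: left | right | true | false}}."""
    return "true" if left.strip() == right.strip() else "false"


current_page = "Q & A"

# Encoded name compared with a raw literal: never matches.
print(ifeq(pagename_html(current_page), "Q & A"))                 # false

# Both sides passed through the same function: matches.
print(ifeq(pagename_html(current_page), pagename_html("Q & A")))  # true
```

The point is only that both operands must go through the same encoding before
they are compared.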

-----

In summary:

* {{(FULL|BASE|SUB)PAGENAMEE|...}} return URL-encoded names
* {{(FULL|BASE|SUB)PAGENAME|...}} return HTML-encoded names

There's NO function in MediaWiki that returns the raw pagename.

-----

But note:
    {{(FULL|BASE|SUB)PAGENAMEE|...}}
is also different from
    {{URLENCODE:{{(FULL|BASE|SUB)PAGENAME|...}}}}

Because in the latter case, URLENCODE takes an HTML-encoded name as its
parameter, so the result is double-encoded: the HTML entities (containing the
characters & # ;) and the SPACEs get URL-encoded using %nn and +.

But in the first case the MediaWiki-specific URL-encoding performed by
PAGENAMEE is different from standard URL-encoding (it does not generate "+" for
spaces, but generates underscores).

So:

1. "{{PAGENAMEE|Q & A}}"
   returns in fact "Q_%26_A"
2. "{{PAGENAME|Q & A}}"
   returns in fact "Q &#38; A"
3. "{{URLENCODE:{{PAGENAME|Q & A}}}}"
   returns in fact at least this: "Q+%26%2338;+A"
   I don't know if URLENCODE also recodes the semicolon;
   if so, the result will instead be: "Q+%26%2338%3B+A"
   In all cases this will be different from the result of case 1 !!!
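
The three cases can be approximated outside MediaWiki with Python's
urllib.parse. This is a sketch under assumptions: pagename_html() and
pagenamee() are hypothetical models of the magic words (ampersand-only HTML
encoding, and underscores for spaces plus percent-encoding), and quote_plus()
stands in for URLENCODE, whose exact behavior may differ.

```python
from urllib.parse import quote, quote_plus


def pagename_html(title: str) -> str:
    """Hypothetical model of {{PAGENAME|title}}: HTML-encode only '&'."""
    return title.replace("&", "&#38;")


def pagenamee(title: str) -> str:
    """Hypothetical model of {{PAGENAMEE|title}}: spaces become
    underscores, then everything unsafe is percent-encoded."""
    return quote(title.replace(" ", "_"), safe="")


case1 = pagenamee("Q & A")      # models case 1
case2 = pagename_html("Q & A")  # models case 2
case3 = quote_plus(case2)       # models case 3: URL-encoding the HTML form

print(case1)  # Q_%26_A
print(case2)  # Q &#38; A
print(case3)  # Q+%26%2338%3B+A
print(case1 == case3)  # False
```

With Python's quote_plus() the semicolon is percent-encoded (as %3B); whether
MediaWiki's URLENCODE does the same is not verified here. Either way the
double-encoded string differs from the PAGENAMEE form.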

-----

This strange behavior means that there are some characters "permitted" in URLs
to MediaWiki sites that are transformed in a very strange way, such as:

1. http://www.mediawiki.org/wiki/Q & A

      not directly a valid URL, but the browser URL-encodes it (as UTF-8)
      and requests:

   http://www.mediawiki.org/wiki/Q%20&%20A

       the server accepts it and loads the page named "Q & A"

2. http://www.mediawiki.org/wiki/Q+%26%2338%3B+A

       the server parses this URL as containing a URL-encoded pagename,
       so it first URL-decodes it as:

            Q &#38; A

       the server then parses the decoded name, thinks that it contains an
       anchor, and attempts to load a page named only "Q &",
       with the anchor "38; A" dropped !

3. Valid page names may contain isolated ampersands as valid characters
(internally they are HTML-encoded if you query them with {{PAGENAME}}), but
some sequences will generate errors, such as "&amp;", while "a amp;" will be
accepted...

All this is completely inconsistent, but this time it does not occur in
parser functions; it occurs at the server API level, when handling incoming
HTTP(S) requests that may, or may not, be HTML-encoded, whereas the HTTP
standard says that URLs should be ONLY URL-encoded ! The server also performs
such double-decoding when resolving requests.

-- 
You are receiving this mail because:
You are the assignee for the bug.
You are on the CC list for the bug.
_______________________________________________
Wikibugs-l mailing list
[email protected]
https://lists.wikimedia.org/mailman/listinfo/wikibugs-l