https://bugzilla.wikimedia.org/show_bug.cgi?id=58316

       Web browser: ---
            Bug ID: 58316
           Summary: Javascript escapes in URLs are not decoded
           Product: MediaWiki
           Version: unspecified
          Hardware: All
                OS: All
            Status: UNCONFIRMED
          Severity: normal
          Priority: Unprioritized
         Component: General/Unknown
          Assignee: [email protected]
          Reporter: [email protected]
    Classification: Unclassified
   Mobile Platform: ---

Created attachment 14056
  --> https://bugzilla.wikimedia.org/attachment.cgi?id=14056&action=edit
requests logged on 2012-06-09 for hour 19:00

Instead of URL percent-encoding, pages are sometimes requested through
JavaScript-style escaped URLs. The difference is that "\x", rather than the "%"
symbol, marks the start of an escape sequence. These requests are
not decoded by the MediaWiki software. For example, a request for

https://en.wikipedia.org/w/index.php?title=Robinson_Can%C3%B3

is correctly decoded (the "%C3%B3" is transformed to an accented "o"), whereas
a request for

https://en.wikipedia.org/w/index.php?title=Robinson_Can\xC3\xB3

is not decoded and we're told the page doesn't exist.
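
The difference can be reproduced outside MediaWiki. The following is only an
illustrative sketch in Python, using urllib.parse.unquote as a stand-in for
PHP's rawurldecode(); the variable names are made up for this example and it
assumes the backslashes arrive as literal characters.

```python
from urllib.parse import unquote

# Percent-encoded form of the title: "%C3%B3" is the UTF-8 encoding of "ó".
percent_title = "Robinson_Can%C3%B3"
# JavaScript-style form: literal backslash, "x", and two hex digits.
js_title = "Robinson_Can\\xC3\\xB3"

decoded_percent = unquote(percent_title)
decoded_js = unquote(js_title)

print(decoded_percent)         # the accented title
print(decoded_js == js_title)  # True: the "\x" escapes pass through untouched
```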

As I noted at
https://en.wikipedia.org/wiki/Wikipedia:Redirects_for_discussion/Log/2013_December_9#.5Cx22Weird_Al.5Cx22_Yankovic
there's been a tremendous increase in the amount of this traffic reaching the
WMF projects, from about one request per hour in September 2011 to millions of
requests per day in November 2013.

Perhaps it would be desirable to transform "\x" to "%" before passing URLs to
rawurldecode() so that these requests will reach the intended pages.
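
A minimal sketch of that idea, written in Python rather than PHP and using
urllib.parse.unquote in place of rawurldecode(). The function name
decode_title and the regular expression are hypothetical, not part of
MediaWiki, and the sketch assumes the backslashes reach the server as literal
characters rather than already percent-encoded as "%5C".

```python
import re
from urllib.parse import unquote

# Matches a literal backslash, "x", and exactly two hex digits, e.g. \xC3.
_JS_HEX_ESCAPE = re.compile(r"\\x([0-9A-Fa-f]{2})")


def decode_title(raw_title: str) -> str:
    """Rewrite \\xHH escapes as %HH, then percent-decode the result."""
    percent_form = _JS_HEX_ESCAPE.sub(r"%\1", raw_title)
    return unquote(percent_form)


print(decode_title("Robinson_Can\\xC3\\xB3"))
print(decode_title("\\x22Weird_Al\\x22_Yankovic"))
```

One trade-off of this approach: a title that legitimately contains a
backslash followed by "x" and two hex digits would also be rewritten, so a
real implementation might apply it only as a fallback when the requested page
does not exist.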
