https://bugzilla.wikimedia.org/show_bug.cgi?id=58316
--- Comment #3 from Derk-Jan Hartman <[email protected]> ---
In summary:

* The Apache log contains entries that look like Robinson_Can\xC3\xB3, i.e. with the UTF-8 bytes of ó written as \x escapes. This is likely how a request containing Robinson_Canó that was not percent-encoded gets logged (possibly even an IRI request?).
* Log entries are NOT canonical on this front. A request for Robinson_Canó is logged differently than a request for Robinson_Can%C3%B3.
* The statistics of stats.grok.se might not handle these properly (collating them, ignoring them, or just not making them accessible?).
* Someone else made a tool to detect red links, which does make the \x entries accessible/visible.
* Someone is making mass redirects from \x entries to what they consider to be 'proper' entries. This seems to affect the statistics, but I would say that if the statistics/tools are broken, you are most likely only influencing the statistics, not actually fixing anything.
* There seems to have been a large increase in these kinds of requests (newer browsers, or google/bing.com changing their defaults, could easily account for this).
* You cannot input a UTF-8 byte sequence in the URL field of a browser (because there is no need for this; you would just input ó).
* People can't figure out who is wrong and who is right.

Does that sum it up a bit?
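To illustrate the canonicalisation point: a minimal Python sketch of how a statistics tool could fold the \x form, the percent-encoded form and the plain form of a title into one key. The function name canonical_title is hypothetical, and this is not how stats.grok.se or any existing tool is known to work. It assumes the logged path is UTF-8 and does not deal with literal backslashes in titles.

```python
import re
from urllib.parse import unquote_to_bytes

# Matches a backslash-x escape followed by two hex digits, e.g. \xC3
_HEX_ESCAPE = re.compile(rb"\\x([0-9A-Fa-f]{2})")


def canonical_title(logged: str) -> str:
    """Hypothetical helper: reduce a logged page title to one canonical string.

    Assumption: the underlying bytes are UTF-8. Invalid sequences are
    replaced rather than raising, so one bad log line does not stop a run.
    """
    raw = logged.encode("utf-8")
    # Turn each \xHH escape back into the single byte it stands for.
    raw = _HEX_ESCAPE.sub(lambda m: bytes([int(m.group(1), 16)]), raw)
    # Turn each %HH percent-escape back into its byte as well.
    raw = unquote_to_bytes(raw)
    return raw.decode("utf-8", errors="replace")


if __name__ == "__main__":
    variants = [r"Robinson_Can\xC3\xB3", "Robinson_Can%C3%B3", "Robinson_Canó"]
    for v in variants:
        print(v, "->", canonical_title(v))
```

With something like this in the counting step, all three spellings would be collated under Robinson_Canó, and redirects from the \x titles would not be needed for the numbers to add up.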
