On Fri, 20 Apr 2012 14:37:10 +0200, And Clover <[email protected]> wrote:
> On 2012-04-20 09:15, Anne van Kesteren wrote:
>> Currently browsers differ in what happens when a code point cannot
>> be encoded.
>> What Gecko does [?%C2%A3] makes the resulting data impossible to
>> interpret.
>> What WebKit does [?%26%23163%3B] is consistent with form submission. I
>> like it.
>
> I do not! It makes the data impossible to recover, just as Gecko's
> behaviour does... in fact worse, because at least Gecko preserves ASCII.
> With the WebKit behaviour it becomes impossible to determine from a pure
> ASCII string '&#163;' whether the user really typed '&#163;' or '£'
> into the input field.
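To make that ambiguity concrete, here is a minimal Python sketch of the character-reference fallback (the helper name and the ASCII target encoding are my own illustration, not any browser API):

```python
import urllib.parse

def ncr_fallback_encode(text, encoding):
    """WebKit/form-submission style fallback (sketch): replace code
    points the target encoding cannot represent with an HTML numeric
    character reference, then percent-encode the result as usual."""
    out = []
    for ch in text:
        try:
            ch.encode(encoding)
            out.append(ch)
        except UnicodeEncodeError:
            out.append('&#%d;' % ord(ch))
    return urllib.parse.quote(''.join(out), safe='', encoding=encoding)

# Both inputs collapse to the same query bytes -- the receiver cannot
# tell which one the user actually typed.
print(ncr_fallback_encode('£', 'ascii'))       # %26%23163%3B
print(ncr_fallback_encode('&#163;', 'ascii'))  # %26%23163%3B
```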
You have the same problem with Gecko's behavior and multi-byte encodings.
That's actually worse, since an erroneous three-byte sequence will throw
the multi-byte decoders off.
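A short Python sketch of that failure mode (the Shift_JIS server and the example characters are my assumptions, not from this thread):

```python
import urllib.parse

QUERY = '£'  # U+00A3, not representable in the page's legacy encoding

# Gecko-style fallback (sketch): unencodable code points are emitted
# as raw UTF-8 bytes, regardless of the page's encoding.
gecko = urllib.parse.quote(QUERY, safe='', encoding='utf-8')
print(gecko)  # %C2%A3

# A server decoding the query as Shift_JIS misreads those two UTF-8
# bytes as two half-width katakana characters -- silent corruption.
mojibake = b'\xc2\xa3'.decode('shift_jis')
assert mojibake != QUERY

# Worse, a three-byte UTF-8 sequence can desynchronize the decoder:
# '√' (U+221A) is E2 88 9A in UTF-8, and the stray 0x9A byte is a
# Shift_JIS lead byte that swallows the following ASCII 'A'.
desync = b'\xe2\x88\x9aA'.decode('shift_jis')
assert 'A' not in desync
```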
> It has the advantage of consistency with the POST behaviour, but that
> behaviour is an unpleasant legacy hack which encourages a
> misunderstanding of HTML-escaping that promotes XSS vulns. I would not
> like to see it spread any further than it already has.
It's both GET and POST. So really the only difference here is manually
constructed URLs.
Also, I think we should flag all non-UTF-8 usage. This is mostly about
deciding behavior for legacy content, which will already be broken if it
runs into this minor edge case.
--
Anne van Kesteren
http://annevankesteren.nl/