On 2012-04-20 14:37, And Clover wrote:
On 2012-04-20 09:15, Anne van Kesteren wrote:
Currently browsers differ for what happens when the code point cannot
be encoded.
What Gecko does [?%C2%A3] makes the resulting data impossible to
interpret.
What WebKit does [?%26%23163%3B] is consistent with form submission. I
like it.

I do not! It makes the data impossible to recover just as Gecko does...
in fact worse, because at least Gecko preserves ASCII. With the WebKit
behaviour it becomes impossible to determine from a pure ASCII string
'&#163;' whether the user really typed '&#163;' or '£' into the input
field.
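
The ambiguity described above can be reproduced with a minimal Python sketch. It is an illustration only, not browser code: the function names are hypothetical, and "ascii" stands in for any page encoding that cannot represent '£'. The two expected query strings are the ones quoted earlier in this thread.

```python
from urllib.parse import quote, unquote

def webkit_style(value, encoding="ascii"):
    # Assumed model of the WebKit behaviour: code points the encoding
    # cannot represent become decimal character references, and the
    # resulting bytes are then percent-encoded.
    return quote(value.encode(encoding, errors="xmlcharrefreplace"), safe="")

def gecko_style(value):
    # Assumed model of the Gecko behaviour: the value is sent as
    # percent-encoded UTF-8 bytes regardless of the page encoding.
    return quote(value.encode("utf-8"), safe="")

typed_symbol = webkit_style("£")          # user typed the pound sign
typed_reference = webkit_style("&#163;")  # user typed the six ASCII characters

print(typed_symbol)              # %26%23163%3B
print(typed_reference)           # %26%23163%3B
print(unquote(typed_symbol))     # &#163;
print(gecko_style("£"))          # %C2%A3
```

Both inputs produce the same query string under the WebKit-style model, so the server cannot tell which one the user entered.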

It has the advantage of consistency with the POST behaviour, but that
behaviour is an unpleasant legacy hack which encourages a
misunderstanding of HTML-escaping that promotes XSS vulns. I would not
like to see it spread any further than it already has.

+1

Indeed.

I think this is a case where you want to fail early (for some value of "fail"); so maybe substituting with "?" makes most sense.
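
As a rough sketch of what substituting "?" could look like (an assumption for illustration, with a hypothetical function name and "ascii" again standing in for a page encoding that cannot represent the character):

```python
from urllib.parse import quote

def question_mark_style(value, encoding="ascii"):
    # Unencodable code points are replaced with "?" before
    # percent-encoding, so the original character is dropped
    # rather than rewritten as markup.
    return quote(value.encode(encoding, errors="replace"), safe="")

print(question_mark_style("a£b"))  # a%3Fb
```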

Do any servers *expect* the WebKit behavior? If so, why don't they just fix the pages they serve to use UTF-8 and get consistent behavior throughout?

Best regards, Julian
