[Bug 24845] Merge <form> and URL error modes?

https://www.w3.org/Bugs/Public/show_bug.cgi?id=24845

Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |xn--mlform-iua@xn--mlform-i
                   |                            |ua.no

--- Comment #7 from Leif Halvard Silli <xn--mlform-iua@xn--mlform-iua.no> ---
(In reply to Henri Sivonen from comment #5)
> (In reply to Simon Pieters from comment #4)

> > Gecko switches to utf-8 for the whole URL and gets ?%C3%A5.
> 
> Hmm. Switching the encoding when one non-representable character is added
> doesn't seem like a good idea to me, especially if other browsers don't do
> the same. CCing bz in the hope of getting background info.

Well, compared with what they do for erroneous URLs with *representable*
characters, both Firefox and Webkit/Blink switch the encoding when one
non-representable character is added. They just do it in different ways.

Firefox just follows the normal procedure of representing code points higher
than U+009F as UTF-8 percent-encoded characters. Webkit/Blink do the same - but
only for the *representable* characters.

The issue here is, I guess, 'storing': The percent-encoding is decoded e.g.
when storing a form.  And so, in Webkit/Blink, ?%26%23229%3B becomes ?&#229;,
which is compatible even with Cyrillic encodings.
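
To make the round trip concrete, here is a minimal sketch in Python. It is
only an illustration, not browser code: windows-1251 is assumed as the
Cyrillic page encoding (the bug does not name one), the helper names are
hypothetical, and Python's `xmlcharrefreplace` error handler stands in for
the Webkit/Blink fallback to a numeric character reference.

```python
from urllib.parse import quote, unquote

PAGE_ENCODING = "windows-1251"  # assumption: any Cyrillic legacy encoding would do
char = "\u00e5"                 # U+00E5, not representable in windows-1251


def webkit_style_query(text, encoding=PAGE_ENCODING):
    """Hypothetical helper: replace unencodable characters with a numeric
    character reference, then percent-encode every reserved byte."""
    raw = text.encode(encoding, errors="xmlcharrefreplace")  # b'&#229;'
    return "?" + quote(raw, safe="")


def stored_value(query, encoding=PAGE_ENCODING):
    """Hypothetical helper: percent-decode the query as a server storing
    the form value in the page encoding might do."""
    return unquote(query[1:], encoding=encoding)


query = webkit_style_query(char)
stored = stored_value(query)

print(query)   # ?%26%23229%3B
print(stored)  # &#229;
```

The stored value is pure ASCII, so it survives in any ASCII-compatible
encoding and still renders as the intended character once it is parsed as
HTML.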

For IE, the character is stored as a representable, and therefore wrong,
character.

For Firefox, unless it performs some extra encoding step after the decoding,
the percent-encoded character is probably stored as UTF-8 encoded bytes
read through a non-UTF-8 parser.
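
A minimal Python sketch of that Firefox case, under the same caveats: it is
an illustration rather than browser code, and windows-1251 is an assumed
page encoding that the bug itself does not name.

```python
from urllib.parse import quote, unquote_to_bytes

PAGE_ENCODING = "windows-1251"  # assumption: the non-UTF-8 page encoding
char = "\u00e5"                 # U+00E5, not representable in windows-1251

# Gecko-style: percent-encode the UTF-8 bytes of the character.
gecko_query = "?" + quote(char.encode("utf-8"), safe="")

# Percent-decoding gives back the raw UTF-8 bytes.
stored_bytes = unquote_to_bytes(gecko_query[1:])

# Reading those bytes in the page encoding yields two unrelated characters.
misread = stored_bytes.decode(PAGE_ENCODING)

# Only a reader that knows the bytes are UTF-8 recovers the original.
recovered = stored_bytes.decode("utf-8")

print(gecko_query)                    # ?%C3%A5
print(len(misread), misread == char)  # 2 False
print(recovered == char)              # True
```

Whether the stored value is usable therefore depends on the consumer
knowing that this one value is UTF-8 while the rest of the page is not.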

For storing a form, the Webkit/Blink behavior seems more fruitful. Inside a Web
page, the Firefox method might be better?


Received on Friday, 4 April 2014 05:16:35 UTC