[whatwg] form charset from Olav Junker Kjær on 2005-04-19 (public-whatwg-archive@w3.org from April 2005)

From: Olav Junker Kjær <olav@olav.dk>
Date: Wed, 20 Apr 2005 00:02:08 +0200
Message-ID: <42657FE0.7090805@olav.dk>

I understand that the _charset_ field is needed in URL-encoded requests,
since any encoding can be chosen through accept-charset and
there is otherwise no way for the server to know which encoding was used.
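
To make the role of _charset_ concrete, here is a minimal Python sketch. It
assumes a server that trusts the submitted _charset_ value; the helper names
encode_form and decode_form are hypothetical, and a real server would also
have to validate the charset name before using it.

```python
from urllib.parse import urlencode, parse_qs

def encode_form(fields, charset):
    # Browser side (illustrative): add a _charset_ field naming the
    # encoding, then percent-encode every value in that encoding.
    data = dict(fields, _charset_=charset)
    return urlencode(data, encoding=charset)

def decode_form(query):
    # Server side (illustrative): the charset name itself is plain ASCII,
    # so probe for it first, then decode the whole query with it.
    probe = parse_qs(query, encoding="ascii", errors="replace")
    charset = probe.get("_charset_", ["utf-8"])[0]  # fallback is an assumption
    return parse_qs(query, encoding=charset)

query = encode_form({"name": "Kjær"}, "iso-8859-1")
print(query)                        # name=Kj%E6r&_charset_=iso-8859-1
print(decode_form(query)["name"])   # ['Kjær']
```

Without the _charset_ field, the server in this sketch would fall back to
guessing, which is exactly the problem described above.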

However, is it really the right thing to allow arbitrary encodings of
GET queries in the first place? The official Right Way to encode URLs is 
to use UTF-8, and it seems strange to allow a different encoding after 
the question mark.

Also, URLs are supposed to be context independent, e.g. you should be 
able to bookmark a query, send it in a mail and so on. This might be 
problematic if the correct interpretation of the URL is dependent on the 
encoding or the accept-charset attribute on the form in the originating 
page.
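
A small Python sketch of the ambiguity, using only the standard library. The
parameter name q and the helper read_query are made up for illustration. The
same text yields two different URLs depending on the charset, and a recipient
who assumes the wrong one gets garbage:

```python
from urllib.parse import quote, unquote

name = "Kjær"

# The same form value, percent-encoded under two different charsets.
utf8_query = "q=" + quote(name, encoding="utf-8")          # q=Kj%C3%A6r
latin1_query = "q=" + quote(name, encoding="iso-8859-1")   # q=Kj%E6r

def read_query(query, assumed_charset):
    # Decode the value after "=" using whatever charset the reader assumes.
    value = query.partition("=")[2]
    return unquote(value, encoding=assumed_charset, errors="replace")

print(read_query(latin1_query, "iso-8859-1"))  # Kjær (correct guess)
print(read_query(latin1_query, "utf-8"))       # Kj\ufffdr shown as Kj?r-style damage
print(read_query(utf8_query, "iso-8859-1"))    # KjÃ¦r (classic mojibake)
```

Nothing in the bookmarked or mailed URL itself says which of the two readings
is the intended one.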

Of course we cannot just mandate UTF-8 always, since there is the issue 
of backwards compatibility. If I'm not mistaken, browsers usually 
URL-encode forms using the same charset as the page. If we want to
avoid breaking server scripts, this should remain the default. 
However, the only legal value in accept-charset should be UTF-8 when the 
method is GET.
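
As a rough sketch of that rule in Python (the function name and signature are
hypothetical, and this is only one reading of the proposal, not spec text):

```python
def query_charset(method, page_charset, accept_charset=None):
    # No accept-charset on the form: keep the legacy default, i.e. the
    # charset of the page the form lives on.
    if accept_charset is None:
        return page_charset
    # Proposed restriction: for GET, accept-charset may only name UTF-8.
    is_utf8 = accept_charset.strip().lower() in ("utf-8", "utf8")
    if method.upper() == "GET" and not is_utf8:
        raise ValueError("accept-charset must be UTF-8 when method is GET")
    return accept_charset

print(query_charset("GET", "iso-8859-1"))            # iso-8859-1
print(query_charset("GET", "iso-8859-1", "UTF-8"))   # UTF-8
print(query_charset("POST", "utf-8", "iso-8859-1"))  # iso-8859-1
```

Existing pages without accept-charset behave as before, while new GET forms
that opt in can only opt in to UTF-8.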

regards
Olav Junker Kjær

Received on Tuesday, 19 April 2005 15:02:08 UTC