RE: what should the charset be in the response to the server

Modern browsers all support HTML 4.0, which provides the "accept-charset"
attribute for FORM elements:
 http://www.w3.org/TR/html4/interact/forms.html#adef-accept-charset 

A page may thus be designed so that its POST data is sent in a specific
encoding.

IE5+ honors this attribute in some but not all cases. It will use it to
"upgrade" the encoding from a legacy code page (e.g. Shift_JIS, cp1252, etc.)
to Unicode (e.g. UTF-8), but not the reverse (which is annoying). Mozilla
always honors it. Opera does not. I would expect Konqueror/Safari to support
it as well, although I am not sure. Old browsers (Netscape 4.x, IE 4, etc.)
do not. That was the situation the last time I checked, at least.

If there is no "accept-charset" attribute, browsers use the encoding of the
page that contains the form, as determined from the HTTP header, META tags,
auto-detection, etc.

A common trick to deal with the idiosyncrasies between browsers (e.g. lack
of support for 'accept-charset') is to include a hidden field in the form
with a known value. Since the value is known, it's easy enough for a CGI
script to determine the encoding of the POST data: the script encodes the
known value in each candidate encoding and compares the result with the
bytes it actually received. The candidate that matches is the encoding the
browser used.
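
To make the hidden-field trick concrete, here is a minimal sketch in Python.
The field name "enc_probe", the probe value and the list of candidate
encodings are all made up for illustration; pick whatever suits your pages.
It assumes the script has the raw urlencoded POST body and that field names
are plain ASCII.

```python
from typing import Dict, Optional
from urllib.parse import unquote_to_bytes

# Hypothetical names and values, chosen only for this sketch.
PROBE_FIELD = "enc_probe"
PROBE_VALUE = "\u00e9\u20ac"      # e-acute followed by the euro sign
CANDIDATES = ["utf-8", "cp1252"]  # encodings the site expects to see


def raw_fields(body: str) -> Dict[str, bytes]:
    """Split an urlencoded body into fields, keeping values as raw bytes."""
    fields = {}
    for pair in body.split("&"):
        if not pair:
            continue
        name, _, value = pair.partition("=")
        key = unquote_to_bytes(name.replace("+", " ")).decode("ascii", "replace")
        fields[key] = unquote_to_bytes(value.replace("+", " "))
    return fields


def detect_encoding(fields: Dict[str, bytes]) -> Optional[str]:
    """Return the candidate whose encoding of the probe matches the bytes sent."""
    received = fields.get(PROBE_FIELD)
    if received is None:
        return None
    for enc in CANDIDATES:
        try:
            expected = PROBE_VALUE.encode(enc)
        except UnicodeEncodeError:
            continue  # this candidate cannot represent the probe at all
        if received == expected:
            return enc
    return None


def decode_form(body: str) -> Dict[str, str]:
    """Decode every field using the encoding revealed by the probe field."""
    fields = raw_fields(body)
    enc = detect_encoding(fields)
    if enc is None:
        raise ValueError("could not determine the encoding of the POST data")
    return {name: value.decode(enc) for name, value in fields.items()}
```

For example, decode_form("enc_probe=%E9%80&name=Bj%F6rn") picks cp1252, while
the same form posted as UTF-8 arrives as "enc_probe=%C3%A9%E2%82%AC&..." and
is decoded as utf-8. The probe value has to encode differently in every
candidate, otherwise two encodings cannot be told apart.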


Regards,
em2 Solutions
Michael Jansson

> -----Original Message-----
> From: Addison Phillips [mailto:aphillips@webmethods.com]
> Sent: Monday, July 21, 2003 8:08 PM
> To: Sinha, Raj (Raj)
> Cc: www-international@w3.org
> Subject: Re: what should the charset be in the response to the server
> 
> 
> 
> Hi Raj,
> 
> The browser always sends data back to the server in the charset of the
> page. That is, if the browser thinks the page is UTF-8, it will encode
> its response using UTF-8.
> 
> I use the word "thinks" because, of course, the browser must interpret
> the encoding of the page from the HTTP header and any META tag in the
> file. In some cases it must detect the encoding algorithmically. So
> whatever charset the browser ends up interpreting the page using is the
> encoding it uses for a response (either GET or POST).
> 
> Hope that helps.
> 
> Best Regards,
> 
> Addison
> 
> -- 
> 
> Addison P. Phillips
> Director, Globalization Architecture
> webMethods, Inc.
> 
> Internationalization is an architecture. It is not a feature.
> 
> [Chair, W3C-I18N-WG, Web Services Task Force]
> http://www.w3.org/International/ws
> 

Received on Monday, 21 July 2003 14:54:56 UTC