RE: How browser sents UTf-8 data in request

From: Martin Duerst <duerst@w3.org>
Date: Wed, 20 Feb 2002 14:03:50 +0900
Message-Id: <4.2.0.58.J.20020220140206.03686678@localhost>
To: "souravm" <souravm@infy.com>, "Yung-Fong Tang" <ftang@netscape.com>
Cc: <www-international@w3.org>
At 08:54 02/02/20 +0530, souravm wrote:
>Hi Young,
>
>1. Regarding point one, the encoding of the input data: can you tell me
>what exactly happens? That is, what is the input encoding for the
>text box? What conversions happen between the moment a Japanese
>string is typed and the moment it is shown in the text box? I'm totally
>confused about this.
>
>2. Secondly, my form's encoding is UTF-8 (this is set as the content type
>when the form is sent as the response to a previous request).
>
>3. My form's encoding is UTF-8.
>Now why should it depend on the browser type? If a browser supports UTF-8,
>is it not supposed to do this conversion?
>getCharacterEncodingType is a method of HTTPServletRequest.

If your page is in UTF-8, the result should come back in UTF-8,
independent of browsers. The only exception is browsers that
don't support UTF-8 (current minor ones or very old ones).
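
As a rough illustration of that round trip, here is a minimal Java sketch. It assumes the browser percent-encodes the form value using the page's encoding and that the server decodes it with the same one. The class and method names (FormEncodingSketch, encodeFormValue, decodeFormValue) are made up for this example and are not part of the servlet API; only java.net.URLEncoder and java.net.URLDecoder are real library classes.

```java
import java.io.UnsupportedEncodingException;
import java.net.URLDecoder;
import java.net.URLEncoder;

public class FormEncodingSketch {

    // Hypothetical helper: percent-encode a form value roughly as a browser
    // would for a page served in the given charset.
    static String encodeFormValue(String value, String charsetName) {
        try {
            return URLEncoder.encode(value, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charsetName, e);
        }
    }

    // Hypothetical helper: decode a raw form value on the server side,
    // assuming the server knows which charset the form was submitted in.
    static String decodeFormValue(String raw, String charsetName) {
        try {
            return URLDecoder.decode(raw, charsetName);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("unknown charset: " + charsetName, e);
        }
    }

    public static void main(String[] args) {
        // "Nihongo" (three kanji), written with Unicode escapes so the
        // source file encoding does not matter.
        String typed = "\u65E5\u672C\u8A9E";

        // What goes over the wire when the form page is UTF-8.
        String onTheWire = encodeFormValue(typed, "UTF-8");
        System.out.println(onTheWire); // %E6%97%A5%E6%9C%AC%E8%AA%9E

        // Decoding with the same charset restores the original string.
        System.out.println(decodeFormValue(onTheWire, "UTF-8").equals(typed)); // true

        // Decoding with a different charset garbles it.
        System.out.println(decodeFormValue(onTheWire, "ISO-8859-1").equals(typed)); // false
    }
}
```

The bytes on the wire depend only on the page encoding, not on the Windows code page. The server still has to decode with the matching charset, as the last line shows.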

Regards,    Martin.



>Regards,
>Sourav
>
>-----Original Message-----
>From: Yung-Fong Tang [mailto:ftang@netscape.com]
>Sent: Tuesday, February 19, 2002 10:10 PM
>To: souravm
>Cc: www-international@w3.org
>Subject: Re: How browser sents UTf-8 data in request
>
>
>
>
>souravm wrote:
>
> >Hi All,
> >
> >I have a question about how browsers work.
> >
> >Let us assume that I have an HTML form shown in a browser. The response
> >that created this form had the content type set to UTF-8 in the header.
> >Hence, if I check the encoding through the browser's toolbar, it
> >shows UTF-8.
> >This browser is running on Windows 2K, whose current locale is Japanese.
> >Windows 2K has IME support.
> >Now if I enter a Japanese string in one text box of this form and submit
> >the form, my understanding is:
> >1. The input data will actually be in Shift_JIS (or the code page used
> >for the Japanese locale by Windows 2K).
> >
>How can you know which encoding the "input data" is in? Until the
>data is stored somewhere, you don't know what the encoding IS. Using a
>Japanese locale under Windows 2K only means the ACP (ANSI code page) is
>Shift_JIS. It does not mean the input method communicates with the text
>box in Shift_JIS, nor does it mean the text box is in Shift_JIS.
>
> >
> >2. The browser will convert this string from Shift_JIS to UTF-8 before
> >sending it to the server.
> >
>That is because your FORM is in UTF-8, right ?
>
> >
> >3. On the server, if I call the method getCharacterEncodingType of the
> >request object, it will show me UTF-8.
> >
> >Can anyone please verify whether the above conclusions are
> >correct?
> >
>1. Too many variables here.
>a. What is the encoding of your FORM? Shift_JIS?
>b. Which browser are you using? IE3? IE4? IE5? IE5 on Mac? Netscape
>1.x? Netscape 2.x? Netscape 3.x? Netscape 4.x? Netscape 6.x? Opera?
>c. What is getCharacterEncodingType? Is it part of a particular
>software package?
>
> >
> >
> >Regards,
> >Sourav
> >
Received on Wednesday, 20 February 2002 00:40:16 GMT
