
RE: Form response charset

From: <stephen_holmes@lionbridge.com>
Date: Tue, 13 Apr 99 21:43:08 GMT
Message-Id: <9904139240.AA924036229@lbt-smtp1.lionbridge.com>
To: <www-international@w3.org>, <taka@netscape.com>, <jesse@Novonyx.COM>
Cc: <www-international@w3.org>

Obviously, multilingual forms are _still_ a hot topic, but web forms, to be
truly universal, need to be able to accept input of any characters regardless
of the form's original encoding.  I think that the current solutions all
succeed and fail to certain degrees.  You know, I might be an Irish guy using
Irish extended characters but logging data to a Japanese site originally
encoded in sjis/EUC etc.  I'd still like my original characters retained in
any return contact I have, without transformation problems.

Personally, I've had the most success with this stuff by _not_ specifying the
encoding at all and using a local cookie instead to store the user's language
preference, which I can then interpret and treat accordingly within my HTML.
Perhaps the goal should be NOT to specify the encoding of forms, but rather to
enforce a Unicode or 10646 scheme for all forms as the ONLY input mechanism;
browsers would then be obliged to interface with the relevant local IMEs and
convert their data to Unicode.  Of course, a Mars mission is also planned ;-)


-----Original Message-----
From: <www-international@w3.org > 
Sent: 13 April 1999 16:04
To: "'taka@netscape.com'" <taka@netscape.com>; Jesse Hall
Cc: www-international@w3.org
Subject: RE: Form response charset 

Actually, this is the question that I am working on too.
In other words, suppose we have
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=x-sjis">
and, let's say, we use Japanese Windows 95.
The default input data is in the cp932 character set on Japanese Win95. Are
you saying that it gets converted to UTF-8 somewhere? If so, where? How
does it work?
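
The thread does not say where a browser would perform such a conversion. The
Python sketch below only illustrates what a cp932-to-UTF-8 conversion involves:
the same text has different byte sequences in the two encodings, and converting
means decoding from one and re-encoding in the other. The sample string is an
assumption chosen for illustration.

```python
# Sample text chosen for illustration; not from the original message.
text = "日本語"

# The same characters as bytes in each encoding.
cp932_bytes = text.encode("cp932")
utf8_bytes = text.encode("utf-8")

# Conversion: decode the cp932 bytes to characters, then encode as UTF-8.
converted = cp932_bytes.decode("cp932").encode("utf-8")

print(cp932_bytes.hex(), len(cp932_bytes))
print(utf8_bytes.hex(), len(utf8_bytes))
```

The conversion is only lossless when the correct source encoding is known,
which is the problem the rest of this thread is about.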


        -----Original Message-----
        From:        taka@netscape.com [SMTP:taka@netscape.com]
        Sent:        Tuesday, April 13, 1999 12:44 PM
        To:        Jesse Hall
        Cc:        www-international@w3.org
        Subject:        Re: Form response charset

        Hi Jesse,

        One solution to your question is to specify the charset of your
        original document.
        Major browsers send form data back to the server in the character
        encoding used by the document that contains the form.
        For example, if the server sends an HTML document like the one below,

        <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=x-sjis">
        <!-- some form -->

        the browser sends the input in Shift_JIS encoding.  If you want to
        receive it in UTF-8, specify UTF-8 instead of x-sjis.
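
To make the behaviour described above concrete, here is a minimal Python sketch
of the server side. It assumes the server remembers which charset it declared
for the page and decodes the submitted form body with that same charset. The
names PAGE_CHARSET and decode_form_body and the sample field are hypothetical;
the browser is simulated with urlencode.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical: the charset the server declared when it sent the form page.
PAGE_CHARSET = "shift_jis"


def decode_form_body(body: str, charset: str = PAGE_CHARSET) -> dict:
    """Decode an application/x-www-form-urlencoded body, assuming the browser
    encoded it in the same charset as the page that contained the form."""
    # errors="strict" raises UnicodeDecodeError instead of hiding a mismatch.
    return parse_qs(body, encoding=charset, errors="strict")


# Simulate what a browser would submit from a Shift_JIS page.
submitted = urlencode({"name": "日本語"}, encoding="shift_jis")
fields = decode_form_body(submitted)
```

Decoding the same body as UTF-8 with errors="strict" fails for this sample,
which is one way a server can notice that the charset it assumed is wrong.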


        Jesse Hall wrote:

        > Hello,
        > I'm not sure this is the proper forum, but I've searched everywhere
        > I could think of and couldn't find an answer to my question. If
        > there's a more appropriate place for me to look/ask, please let me
        > know.
        > I'm working on internationalizing a web-based application. One of
        > the requirements is that it must accept international input via
        > forms. My problem is that I haven't found a way of determining which
        > character set the data coming back from the browser is in (e.g. for
        > an INPUT TYPE=TEXT or [...] field).
        > I'm using UTF-8 for all the pages I send. The browsers I've tested
        > with handle this properly. However, what I'm getting back from e.g.
        > a Japanese browser (I've tried two) running on Japanese Windows is
        > not UTF-8. The best solution from my point of view is to always get
        > the response in UTF-8, but if there is a way to determine the
        > charset of the returned data, I can of course do the conversion
        > myself if necessary.
        > TIA,
        > Jesse Hall
        > jesse@novonyx.com

        Takayuki Tei
        mailto:taka@netscape.com http://people.netscape.com/taka/
Received on Tuesday, 13 April 1999 16:43:55 UTC
