RE: Form response charset

Multilingual forms are obviously _still_ a hot topic, but web forms, to be
truly universal, need to accept any characters as input, regardless of the
form's original encoding.  I think the current solutions all succeed and fail
to varying degrees.  For example, I might be an Irish guy using Irish extended
characters but submitting data to a Japanese site originally encoded in
sjis/EUC etc.  I'd still like my original characters retained, without
transformation problems, in any reply I get back.

Personally, I've had the most success by _not_ specifying the encoding at all
and instead using a local cookie to store the user's language preference,
which I can then interpret and handle accordingly within my HTML scripts.
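
As a rough illustration of the cookie approach, here is a minimal Python
sketch. The cookie name `lang`, the language-to-charset table, and both
function names are assumptions made up for this example, not taken from any
real site. The only point is that the server picks a decoding based on the
stored preference.

```python
from http.cookies import SimpleCookie
from urllib.parse import unquote_to_bytes

# Hypothetical table: language preference -> encoding assumed for that
# user's form input. A real site would need its own mapping.
LANG_TO_CHARSET = {
    "ja": "shift_jis",
    "ga": "iso-8859-1",
    "en": "iso-8859-1",
}


def charset_from_cookie(cookie_header, default="iso-8859-1"):
    """Return the charset implied by the (assumed) `lang` cookie."""
    cookie = SimpleCookie()
    cookie.load(cookie_header)
    morsel = cookie.get("lang")
    if morsel is None:
        return default
    return LANG_TO_CHARSET.get(morsel.value, default)


def decode_field(raw_value, charset):
    """Decode one percent-encoded form field using the given charset."""
    # In form data a literal "+" arrives as %2B, so a bare "+" means a space.
    data = unquote_to_bytes(raw_value.replace("+", " "))
    return data.decode(charset, errors="replace")
```

For instance, `decode_field("Se%E1n", charset_from_cookie("lang=ga"))` gives
"Seán". The weakness is that the preference is only a guess about the bytes
the browser actually sent.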

Perhaps the goal should be NOT to specify the encoding of forms, but rather to
enforce a Unicode or ISO 10646 scheme as the ONLY input mechanism for all
forms.  Browsers would then be obliged to interface with the relevant local
IMEs and convert their data to Unicode.  Of course, a Mars mission is also
planned ;-)
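
Until browsers really do send Unicode everywhere, a server can at least prefer
it. The Python sketch below decodes a percent-encoded field strictly as UTF-8
and only falls back to a legacy encoding when the bytes are not valid UTF-8.
The function name and the Shift_JIS fallback are assumptions for illustration.

```python
from urllib.parse import unquote_to_bytes


def decode_utf8_or_fallback(raw_value, fallback="shift_jis"):
    """Return (text, charset_used) for one percent-encoded form field.

    UTF-8 is tried first with strict error handling; `fallback` is a
    hypothetical legacy encoding used only if that fails.
    """
    # In form data a literal "+" arrives as %2B, so a bare "+" means a space.
    data = unquote_to_bytes(raw_value.replace("+", " "))
    try:
        return data.decode("utf-8"), "utf-8"
    except UnicodeDecodeError:
        return data.decode(fallback, errors="replace"), fallback
```

This is a heuristic, not a guarantee: a short legacy-encoded string can happen
to be valid UTF-8, which is one more argument for a single mandated encoding.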

Cheers
Steve.





-----Original Message-----
From: <www-international@w3.org > 
Sent: 13 April 1999 16:04
To: "'taka@netscape.com'" <taka@netscape.com>; Jesse Hall
<jesse@Novonyx.COM>
Cc: www-international@w3.org
Subject: RE: Form response charset 


Actually, this is the question that I am working on too.
In other words, suppose we have
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=x-sjis">
and, let's say, we use Japanese Windows 95.
The default input data is in the cp932 character set on Japanese Win95. Are
you saying that it gets converted to UTF-8 somewhere? If so, where? How
does it work?

Kevin

        -----Original Message-----
        From:        taka@netscape.com [SMTP:taka@netscape.com]
        Sent:        Tuesday, April 13, 1999 12:44 PM
        To:        Jesse Hall
        Cc:        www-international@w3.org
        Subject:        Re: Form response charset

        Hi Jesse,

        One solution to your question is to specify the charset of your
        original document.  Major browsers send form data back to the server
        in the character encoding used by the form's page.
        For example, the server sends an HTML document like the one below,

        <html>
        <head>
        <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=x-sjis">
        </head>
        <!-- some form -->
        </html>

        and the browser sends the input in Shift_JIS encoding.  If you want
        to receive it in UTF-8, specify UTF-8 instead of x-sjis.
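
To make the effect concrete, here is a small Python sketch of what the server
would see for the same text under the two page charsets. The helper names
`submit` and `receive` are hypothetical; they only mimic a browser
percent-encoding a field in the page's charset and a server decoding it again.

```python
from urllib.parse import quote_plus, unquote_plus


def submit(text, page_charset):
    """Mimic a browser: percent-encode a field in the page's charset."""
    return quote_plus(text, encoding=page_charset)


def receive(raw_value, assumed_charset):
    """Mimic a server: decode the field with the charset it assumes."""
    return unquote_plus(raw_value, encoding=assumed_charset)


text = "日本"
as_sjis = submit(text, "shift_jis")  # "%93%FA%96%7B"
as_utf8 = submit(text, "utf-8")      # "%E6%97%A5%E6%9C%AC"

# Decoding works only when the server assumes the charset the page declared.
print(receive(as_sjis, "shift_jis") == text)  # True
print(receive(as_sjis, "utf-8") == text)      # False
```

The submitted bytes carry no label, so the server has to remember which
charset it declared for the page that contained the form.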

        Taka


        Jesse Hall wrote:

        > Hello,
        >
        > I'm not sure this is the proper forum, but I've searched everywhere
        > I could think of and couldn't find an answer to my question. If
        > there's a more appropriate place for me to look/ask, please let me
        > know.
        >
        > I'm working on internationalizing a web-based application. One of
        > the requirements is that it must accept international input via
        > forms. My problem is that I haven't found a way of determining which
        > character set the information coming back from the browser is in
        > (e.g. for an INPUT TYPE=TEXT or a TEXTAREA field).
        >
        > I'm using UTF-8 for all the pages I send. The browsers I've tested
        > with handle this properly. However, what I'm getting back from e.g.
        > a Japanese browser (I've tried two) running on Japanese Windows is
        > not UTF-8. The best solution from my point of view is to always get
        > the response in UTF-8, but if there is a way to determine the
        > charset of the returned data, I can of course do the conversion
        > myself if necessary.
        >
        > TIA,
        > Jesse Hall
        > jesse@novonyx.com

        --
        Takayuki Tei
        mailto:taka@netscape.com http://people.netscape.com/taka/
        ldap://ldap.four11.com/gn=Takayuki,mail=taka@netscape.com
        

Received on Tuesday, 13 April 1999 16:43:55 UTC