- From: <stephen_holmes@lionbridge.com>
- Date: Tue, 13 Apr 99 21:43:08 GMT
- To: <www-international@w3.org>, <taka@netscape.com>, <jesse@Novonyx.COM>
- Cc: <www-international@w3.org>
Obviously, the topic of multilingual forms is _still_ a hot one, but web forms, to be truly universal, need to be able to accept the input of any characters regardless of the form's original encoding. I think that the current solutions all succeed and fail to varying degrees. I might be an Irish guy using Irish extended characters but logging data to a Japanese site originally encoded in Shift_JIS/EUC etc., and I'd still like my original characters retained in any return contact, without transformation problems.

Personally, I've had the most success with this by _not_ specifying the encoding at all and instead using a local cookie to store the user's language preference, which I can then interpret and handle accordingly within my HTML scripts.

Perhaps the goal should be NOT to specify the encoding of forms, but rather to enforce a Unicode (ISO 10646) scheme as the ONLY input mechanism for all forms; browsers would then be obliged to interface with the relevant local IMEs and convert their data to Unicode. Of course, a Mars mission is also planned ;-)

Cheers,
Steve

-----Original Message-----
From: <www-international@w3.org>
Sent: 13 April 1999 16:04
To: "'taka@netscape.com'" <taka@netscape.com>; Jesse Hall <jesse@Novonyx.COM>
Cc: www-international@w3.org
Subject: RE: Form response charset

Actually, this is the question I am working on too. In other words, suppose we have

<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=x-sjis">

and, let's say, we use Japanese Windows 95. The default input data are in the cp932 character set on Japanese Win95. Are you saying that it gets converted to UTF-8 somewhere? If so, where, and how does it work?

Kevin

-----Original Message-----
From: taka@netscape.com [SMTP:taka@netscape.com]
Sent: Tuesday, April 13, 1999 12:44 PM
To: Jesse Hall
Cc: www-international@w3.org
Subject: Re: Form response charset

Hi Jesse,

One solution to your question is to specify the charset of your original document. Major browsers send form data back to the server in the character encoding used in the form. For example, if the server sends an HTML document like the one below,

<html>
<head>
<META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=x-sjis">
</head>
<!-- some form -->
</html>

the browser sends its inputs in Shift_JIS encoding. If you want to receive them in UTF-8, specify UTF-8 instead of x-sjis.

Taka

Jesse Hall wrote:
> Hello,
>
> I'm not sure this is the proper forum, but I've searched everywhere I
> could think of and couldn't find an answer to my question. If there's a
> more appropriate place for me to look/ask, please let me know.
>
> I'm working on internationalizing a web-based application. One of the
> requirements is that it must accept international input via forms. My
> problem is that I haven't found a way of determining which character set
> the information coming back from the browser is in (e.g. for an INPUT
> TYPE=TEXT or a TEXTAREA field).
>
> I'm using UTF-8 for all the pages I send. The browsers I've tested with
> handle this properly. However, what I'm getting back from e.g. a Japanese
> browser (I've tried two) running on Japanese Windows is not UTF-8. The
> best solution from my point of view is to always get the response in
> UTF-8, but if there is a way to determine the charset of the returned
> data, I can of course do the conversion myself if necessary.
>
> TIA,
> Jesse Hall
> jesse@novonyx.com

--
Takayuki Tei  mailto:taka@netscape.com
http://people.netscape.com/taka/
ldap://ldap.four11.com/gn=Takayuki,mail=taka@netscape.com
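To make the point in the replies above concrete: if the server knows which charset the form page was served with (UTF-8, per Taka's suggestion), it can decode the submission with that same charset. A minimal sketch follows, assuming a Python handler for a URL-encoded POST body; the field name, the sample value and the PAGE_CHARSET constant are illustrative and not part of the original thread.

import urllib.parse

# Charset the form page was served with (an assumption for this sketch;
# per the thread, browsers echo this charset back in their submission).
PAGE_CHARSET = "utf-8"

def decode_form(raw_body):
    """Decode an application/x-www-form-urlencoded POST body using the
    charset the original form page was delivered in."""
    # Percent-encoded bodies are ASCII-safe; the actual characters only
    # appear once the percent escapes are undone using PAGE_CHARSET.
    pairs = urllib.parse.parse_qsl(raw_body.decode("ascii"),
                                   encoding=PAGE_CHARSET)
    return dict(pairs)

# Hypothetical example: "comment" carries the katakana string "テスト"
# percent-encoded as UTF-8. For a page served as x-sjis you would set
# PAGE_CHARSET = "shift_jis" instead.
body = b"comment=%E3%83%86%E3%82%B9%E3%83%88"
print(decode_form(body))   # {'comment': 'テスト'}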
Received on Tuesday, 13 April 1999 16:43:55 UTC