- From: A. Vine <avine@eng.sun.com>
- Date: Thu, 15 Apr 1999 10:53:46 -0700
- To: Jesse Hall <jesse@Novonyx.COM>
- Cc: www-international@w3.org
Just to add to the info already posted:

How I handle this is to pass a hidden value which is the charset named in the META tag, e.g.

    <HTML>
    <HEAD>
    <META HTTP-EQUIV="Content-Type" CONTENT="text/html; charset=UTF-8">
    </HEAD>
    <BODY>
    <FORM ACTION="http://mywebserver/cgi-bin/mycgiprogram" METHOD="POST">
    <INPUT TYPE="hidden" NAME="myformcharset" VALUE="UTF-8">
    </FORM>
    </BODY>
    </HTML>

But know this: Input text fields are handled by the _native_ system. That is, even though I'm on an en_US system, I can look at Japanese Web pages because I have the appropriate fonts defined to my browsers. However, if I want to type text into a Japanese (EUC-JP) form, I cannot type Japanese so that it _looks_ like Japanese. If I know the Latin1 (ISO-8859-1) equivalents for the Japanese I want to type, I can fake the system out, type the equivalents, and the browser will convert the data into the Japanese charset (EUC-JP in this case).

By the same token, if you pre-fill the input field with Japanese text, it will look like garbage if the native system doesn't handle Japanese. But once you submit the form, the data will be OK. I've seen Netscape and IE both behave this way.

One more interesting note - if you translate ALT text into anything besides Latin1, as a tool tip (the pop-up yellow box) it will be garbage. In Netscape (this is 4.0x) if the image doesn't display at all, the non-Latin1 ALT text displays properly; but in IE 4.x, the ALT text still will be garbage. I believe it's a font issue for the pop-up boxes in the case of Netscape. I haven't tested this in Netscape 4.5, nor IE 5.0.

Regards,
Andrea

--
Andrea Vine
Sun Internet Mail Server i18n architect
avine@eng.sun.com
Remember: stressed is desserts spelled backwards.

Klaus Weide wrote:
>
> On Wed, 14 Apr 1999, Jason Pouflis wrote:
>
> > Browsers then did not submit the charset encoding along with data
> > nor could I find a pre-fabricated solution for best guessing encoding type.
> > This may have changed, please forward useful responses or your summary.
> >
> > wrt to testing on different browsers, I found that although my
> > utf-8 pages would display properly on
> > IE4 (english + japanese IME) on Win95/NT (english),
> > that they didn't display properly on
> > IE4 (japanese) on Win95 (japanese).
> >
> >
> > A response I got on 13 May 1998 from Roman Czyborra was:
> > ==============================================
> > > How do I tell what character set form data is submitted in?
> >
> > There is a discussion of this issue in section 5 of RFC 2070.
> > Ideally, the client sends something like
> >
> > Content-Type: application/x-www-form-urlencoded; charset=UTF-8
> >
> > In practice, most browsers don't send the charset parameter and leave
> > you to guessing what the data might be supposed to mean.
> > Even Lynx 2-8-2 en Netscape 4.04 don't send it.
>
> Actually, Lynx (2-8-2 and some earlier versions) *is* able to send the
> charset parameter if appropriate. It just doesn't always do it, in
> order to not confuse existing scripts. But in a form with an
> ACCEPT-CHARSET="utf-8" attribute, AND the submission data actually
> containing non-US-ASCII characters, you should see the charset
> parameter being sent.
>
> Klaus
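
For anyone wiring up the receiving side, here is a rough sketch of how a script might pick the charset for a submitted form. It is not taken from any particular CGI program: the function name decode_form, its parameters, and the ISO-8859-1 fallback are assumptions made for illustration, and it assumes the body arrives as ASCII with non-ASCII bytes percent-encoded. It prefers a charset parameter on the Content-Type header when the browser sends one, and otherwise falls back to the hidden myformcharset field described above.

```python
import codecs
from urllib.parse import parse_qs


def decode_form(body: bytes, content_type: str = "",
                hint_field: str = "myformcharset",
                default: str = "iso-8859-1") -> dict:
    """Decode an application/x-www-form-urlencoded body (hypothetical helper).

    Charset preference: Content-Type charset parameter, then the hidden
    form field, then `default` (an assumed fallback, not a standard).
    """
    # Assumption: non-ASCII bytes in the body are percent-encoded.
    text = body.decode("ascii")

    # 1. Look for "; charset=..." on the Content-Type header, if any.
    charset = None
    for part in content_type.split(";")[1:]:
        name, _, value = part.strip().partition("=")
        if name.lower() == "charset" and value:
            charset = value.strip('"')

    # 2. Otherwise read the hidden field. Its value is plain ASCII, so a
    #    Latin-1 pass over the data is enough to recover it.
    if charset is None:
        probe = parse_qs(text, encoding="latin-1")
        charset = probe.get(hint_field, [None])[0]

    # 3. Fall back if the name is missing or not a codec Python knows.
    try:
        codecs.lookup(charset or "")
    except LookupError:
        charset = default

    # Re-parse, this time decoding the percent-escapes in that charset.
    return parse_qs(text, encoding=charset, errors="replace")


if __name__ == "__main__":
    fields = decode_form(b"myformcharset=EUC-JP&name=%C6%FC%CB%DC")
    print(fields["name"])
```

The two-pass approach works because the hidden field's name and value are pure ASCII, so they read the same regardless of which charset the rest of the data is in.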
Received on Thursday, 15 April 1999 14:18:24 UTC