Re: Form response charset

From: Jason Pouflis <pouflis@eisa.net.au>
Date: Wed, 14 Apr 1999 08:51:17 +1000
Message-ID: <005701be8600$2704a390$01010101@fireball>
To: "Jesse Hall" <jesse@Novonyx.COM>, <www-international@w3.org>
Jesse Hall wrote:
> I'm working on internationalizing a web-based application. One of the 
> requirements is that it must accept international input via forms. My problem is 
> that I haven't found a way of determining which character set the information 
> coming back from the browser is in 

I was looking at this exact same problem myself a year ago,
in order to develop a web based registry system for
multilingual domain names, which I haven't worked on since.

Back then, browsers did not submit the charset along with the form data,
nor could I find a ready-made solution for guessing the encoding.
This may have changed; please forward any useful responses or your summary.
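
For illustration only, here is a minimal sketch of the kind of best-guess
decoding meant above, written in Python. The function name, the candidate
list and the fallback encoding are all assumptions made for this example,
not something any browser or server prescribes. It tries strict decoders
in order and takes the first one that accepts the bytes.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical candidate list: put the strictest encodings first, because
# a permissive single-byte encoding would accept any byte sequence.
CANDIDATES = ("utf-8", "shift_jis", "euc_jp")


def guess_decode(raw_value, candidates=CANDIDATES, fallback="iso-8859-1"):
    """Return (text, encoding) for one percent-encoded form value."""
    # In form encoding "+" means a space; a literal plus arrives as %2B.
    data = unquote_to_bytes(raw_value.replace("+", " "))
    for enc in candidates:
        try:
            return data.decode(enc), enc
        except UnicodeDecodeError:
            continue
    # ISO-8859-1 maps every byte to a character, so this never fails,
    # but the result may be wrong. It is a guess, not a detection.
    return data.decode(fallback), fallback
```

A guess like this can still be wrong, since the same bytes are often valid
in more than one encoding, so it is at best a stopgap.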

With regard to testing on different browsers, I found that although my
utf-8 pages displayed properly on
IE4 (english + japanese IME) on Win95/NT (english),
they didn't display properly on
IE4 (japanese) on Win95 (japanese).


A response I got on 13 May 1998 from Roman Czyborra was:
==============================================
> How do I tell what character set form data is submitted in?

There is a discussion of this issue in section 5 of RFC 2070.
Ideally, the client sends something like

Content-Type: application/x-www-form-urlencoded; charset=UTF-8

In practice, most browsers don't send the charset parameter and leave
you guessing what the data might be supposed to mean.
Even Lynx 2-8-2 and Netscape 4.04 don't send it.
==============================================
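
As a rough sketch of the ideal case Roman describes, the Python below reads
the charset parameter from a Content-Type header value and uses it to decode
the form body. The helper names and the ISO-8859-1 default are assumptions
made for this example; when the browser sends no charset, the default is
only a guess.

```python
from email.message import Message
from urllib.parse import parse_qs

# Assumed default for clients that send no charset parameter.
ASSUMED_DEFAULT = "iso-8859-1"


def declared_charset(content_type):
    """Return the lower-cased charset parameter, or None if absent."""
    msg = Message()
    msg["Content-Type"] = content_type
    return msg.get_content_charset()


def parse_form(body, content_type):
    """Decode a urlencoded body using the declared charset if there is one."""
    charset = declared_charset(content_type) or ASSUMED_DEFAULT
    return parse_qs(body, encoding=charset, errors="replace")
```

For example, parse_form("name=%E6%97%A5%E6%9C%AC",
"application/x-www-form-urlencoded; charset=UTF-8") decodes the value as
UTF-8, while the same call without the charset parameter falls back to the
assumed default.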

Jason Pouflis     (in Sydney, Australia)
jason@superannuation.net
0411 444 786  mobile
e.internet  pty ltd
   e.business
   e.commerce
   e.mail
  multilingual domain names
Received on Tuesday, 13 April 1999 18:49:47 GMT