specification for form submission from Yung-Fong Tang on 2002-02-19 (www-international@w3.org from January to March 2002)

From: Yung-Fong Tang <ftang@netscape.com>
Date: Tue, 19 Feb 2002 08:50:42 -0800
To: www-international <www-international@w3.org>, Katsuhiko Momoi <momoi@netscape.com>, Bob Jung <bobj@netscape.com>
Message-ID: <3C728262.6090605@netscape.com>

I wonder whether there is a W3C specification that addresses the following issue:

Background:
Any HTML document can be encoded in a charset, labeled either by an HTTP 
header or by an HTML meta tag. When the browser submits form data to the 
server, for backward-compatibility reasons it should send the data 
URL-escaped in the form's charset. However, since any Unicode data can 
be entered in a text field, what should the browser do when the data it 
needs to submit cannot be converted to the charset of the form's HTML?
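
To make the problem concrete, here is a minimal Python sketch of 
"URL-escaped in the form's charset". The helper name 
escape_in_form_charset is hypothetical, and using Python's 
urllib.parse.quote to stand in for a browser's escaping is an assumption 
made for illustration only.

```python
from urllib.parse import quote


def escape_in_form_charset(value: str, form_charset: str) -> str:
    """Encode value in the form charset, then percent-escape the bytes.

    Hypothetical helper: raises UnicodeEncodeError when a character
    has no representation in form_charset.
    """
    return quote(value, safe="", encoding=form_charset, errors="strict")


# The same text yields different bytes depending on the form charset.
latin1 = escape_in_form_charset("café", "iso-8859-1")  # 'caf%E9'
utf8 = escape_in_form_charset("café", "utf-8")          # 'caf%C3%A9'

# Korean text cannot be represented in ISO-8859-1 at all.
try:
    escape_in_form_charset("한국어", "iso-8859-1")
    korean_ok = True
except UnicodeEncodeError:
    korean_ok = False
```

The failing case at the end is exactly the situation the question is 
about: the browser has to do something other than a strict conversion.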

I have observed or heard about the following behaviors:
1. Prohibit the input, copying, and pasting of any characters that 
cannot be converted to the charset. Netscape 4.x did that, so there is 
no way to put Korean characters into an ISO-8859-1 form. In this case, 
what you see is what you submit.
2. Replace characters that cannot be submitted with '?' (N6.2 does that).
3. If an accept-charset attribute is specified on the HTML form, try to 
convert to a different charset (HTML 4.x says something about this). 
However, it would be very bad if one value were in one charset and 
another in a different one.
4. Try to convert to UTF-8 when that happens. Same issue as above: we 
don't want to see one value in one charset and another in a different one.
5. Convert to the form charset, and convert those characters that cannot 
be converted to NCRs such as &#12345;, which are then %-escaped (IE6 on 
my WinXP does that).
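
Behaviors 2, 4, and 5 can be approximated in a few lines of Python. This 
is only a sketch: the function names are hypothetical, the codec error 
handlers "replace" and "xmlcharrefreplace" stand in for what the 
browsers do internally, and the whole-form UTF-8 fallback is one 
assumed way to avoid mixing charsets between values, not something any 
of the browsers above is known to implement.

```python
from urllib.parse import quote


def submit_question_mark(value: str, form_charset: str) -> str:
    # Behavior 2: unconvertible characters become '?', then escape.
    return quote(value, safe="", encoding=form_charset, errors="replace")


def submit_ncr(value: str, form_charset: str) -> str:
    # Behavior 5: unconvertible characters become decimal NCRs
    # (&#NNNNN;), and the NCR text itself is then percent-escaped.
    return quote(value, safe="", encoding=form_charset,
                 errors="xmlcharrefreplace")


def submit_utf8_fallback(fields: dict, form_charset: str):
    # Assumed variant of behavior 4: pick ONE charset for the whole
    # form, so that no two values end up in different charsets.
    try:
        for value in fields.values():
            value.encode(form_charset)
        charset = form_charset
    except UnicodeEncodeError:
        charset = "utf-8"
    escaped = {name: quote(value, safe="", encoding=charset)
               for name, value in fields.items()}
    return charset, escaped


print(submit_question_mark("a한", "iso-8859-1"))  # a%3F
print(submit_ncr("a한", "iso-8859-1"))            # a%26%2354620%3B
print(submit_utf8_fallback({"name": "한", "city": "Oslo"}, "iso-8859-1"))
```

Note that with the NCR approach the server cannot tell an escaped 
character apart from a user who literally typed "&#54620;" into the 
field, and with the UTF-8 fallback the server still needs some way to 
learn which charset was actually used.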

Received on Tuesday, 19 February 2002 11:51:30 UTC