- From: Ian Hickson <ian@hixie.ch>
- Date: Thu, 11 Sep 2003 08:39:15 +0000 (UTC)
- To: "kuro@sonic.net" <kuro@sonic.net>
- Cc: "www-international@w3.org" <www-international@w3.org>
On Wed, 10 Sep 2003, KUROSAKA Teruhiko wrote:
>>
>> If you have a form on a page that is ISO-8859-1, and the data that is
>> submitted (either as GET or as POST) from that form contains characters
>> outside the ISO-8859-1 repertoire, what should the UA do?
>
> Is this a question about the real behavior of the
> popular browsers, or are you developing a browser?

This is a question asked on behalf of Opera and Mozilla, both of which
recently ran into this issue.

> Assuming the latter, the browser is not obligated to send the input data
> in the same charset as the form itself.

It is, however, obligated to send the form submission in one of the
character sets specified in the accept-charset attribute.

> The browser can choose to send the input data in UTF-8, as Martin
> suggested already.

Unfortunately this is not a workable solution, for three reasons:

 * If there's an accept-charset attribute, it's wrong to violate it.

 * There's no standard way to include character set selection
   information in a GET request (for forms with method="get").

 * Most servers cannot handle UTF-8 when they expect ISO-8859-1.

The first two are problems from a theoretical point of view; the last one
is a practical problem that prevents us from doing this.

> I don't think use of character entity is a right solution because the
> character entity is a syntax used in HTML/XML and the data returned from
> the form is not itself in HTML or XML.

Agreed.

Anyone have any other possible solutions? :-)

-- 
Ian Hickson                                      )\._.,--....,'``.    fL
U+1047E                                         /,   _.. \   _\  ;`._ ,.
http://index.hixie.ch/                         `._.-(,_..'--(,_..'`-.;.'
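
To make the problem above concrete, here is a minimal Python sketch of what a UA might try: encode the submitted value in each charset the form accepts, in order, and give up if none can represent it. The function name `encode_form_value` and the idea of passing the accept-charset list in as a Python list are assumptions made for illustration only; nothing here describes what Opera or Mozilla actually do.

```python
from urllib.parse import quote_plus


def encode_form_value(value, accept_charsets):
    """Hypothetical helper: return (charset, percent-encoded value) for the
    first charset in accept_charsets that can represent value, else None."""
    for charset in accept_charsets:
        try:
            raw = value.encode(charset)
        except UnicodeEncodeError:
            # This charset cannot represent the value; try the next one.
            continue
        # Percent-encode the raw bytes as they would appear in a query string.
        return charset, quote_plus(raw)
    # No acceptable charset can carry the data: the case the thread asks about.
    return None


if __name__ == "__main__":
    # Fits in ISO-8859-1, so the form's own charset is enough.
    print(encode_form_value("café", ["iso-8859-1"]))
    # Outside the ISO-8859-1 repertoire and nothing else is allowed.
    print(encode_form_value("日本語", ["iso-8859-1"]))
    # Encodable only because the form also lists UTF-8.
    print(encode_form_value("日本語", ["iso-8859-1", "utf-8"]))
```

The `None` branch is exactly the open question: the sketch has no answer for it, and in the GET case the chosen charset is not conveyed to the server at all.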
Received on Thursday, 11 September 2003 04:39:16 UTC