- From: KUROSAKA Teruhiko <kuro@bhlab.com>
- Date: Thu, 11 Sep 2003 09:37:57 -0700
- To: Ian Hickson <ian@hixie.ch>
- Cc: "kuro@sonic.net" <kuro@sonic.net>, "www-international@w3.org" <www-international@w3.org>
Ian,

>>The browser can chose to send the input data in UTF-8, as Martin
>>suggested already.
>
> Unfortunately this is not a workable solution from three reasons:
>
>  * If there's an accept-charset attribute, it's wrong to violate it.
>  * There's no standard way to include character set selection information
>    in a GET request (for forms with method="get").
>  * Most servers cannot handle UTF-8 when they expect ISO-8859-1.

I see. In that case (accept-charset does not include any Unicode charset),
the best "solution" may be simply to replace the out-of-charset characters
with a replacement character, probably '?', on transmission.

If the form itself is written in ISO-8859-1 or another traditional charset
rather than UTF-* or another Unicode-based charset, and if accept-charset
is absent or does not include UTF-8, the web app is probably not prepared
to handle those characters. That is, even if we come up with a creative way
of transmitting these out-of-charset characters, that would not solve the
real problem: the web app doesn't handle out-of-charset characters.

In other words, I would expect fully internationalized web apps to use
UTF-8 for the form (or to declare that they can accept UTF-8 using
accept-charset, and to use POST instead of GET), and to interpret the
charset parameter of the Content-Type header.

Do you have a particular use case where sending the out-of-charset
characters may be beneficial?

Regards,
-- 
T. "Kuro" Kurosaka, San Francisco, California
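To make the replacement idea concrete, here is a minimal Python sketch of substituting '?' for characters that the form's charset cannot represent. The function names `encode_form_value` and `build_get_query` are hypothetical, ISO-8859-1 is assumed as the form charset, and this only illustrates the suggestion above; it does not describe what any particular browser actually does.

```python
from urllib.parse import urlencode


def encode_form_value(text, charset="iso-8859-1"):
    """Encode text in the form's charset, replacing unencodable characters with '?'."""
    # Python's "replace" error handler substitutes '?' when encoding.
    return text.encode(charset, errors="replace")


def build_get_query(fields, charset="iso-8859-1"):
    """Build a method="get" query string using the same replacement policy."""
    # urlencode passes encoding/errors through to the percent-encoding step.
    return urlencode(fields, encoding=charset, errors="replace")


if __name__ == "__main__":
    print(encode_form_value("naïve 日本語"))      # b'na\xefve ???'
    print(build_get_query({"q": "café 東京"}))    # q=caf%E9+%3F%3F
```

The replacement is lossy: the server receives a literal '?' and cannot tell it from one the user typed, which fits the point above that the web app would not handle those characters anyway.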
Received on Thursday, 11 September 2003 12:37:14 UTC