Re: Form submission when successful controls contain characters outside the submission character set

Ian,


>>The browser can choose to send the input data in UTF-8, as Martin
>>suggested already.
> 
> 
> Unfortunately this is not a workable solution for three reasons:
> 
>  * If there's an accept-charset attribute, it's wrong to violate it.
>  * There's no standard way to include character set selection information
>    in a GET request (for forms with method="get").
>  * Most servers cannot handle UTF-8 when they expect ISO-8859-1.

I see.

In that case (accept-charset does not include any Unicode charset),
the best "solution" may be simply to replace the
out-of-charset characters with a replacement character,
probably '?', on transmission.
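
To make the replacement idea concrete, here is a minimal sketch in
Python of what the submitting side could do. The function name
encode_form and the default charset are only illustrative assumptions,
not anything a browser or specification defines; the sketch relies on
the standard library's "replace" error handler, which substitutes '?'
for each character the target charset cannot represent.

```python
from urllib.parse import urlencode


def encode_form(fields, charset="iso-8859-1"):
    """Serialize form fields as application/x-www-form-urlencoded.

    Hypothetical helper: characters that cannot be represented in
    `charset` are replaced with '?' before percent-encoding.
    """
    return urlencode(fields, encoding=charset, errors="replace")


if __name__ == "__main__":
    # 'é' exists in ISO-8859-1; the two kanji do not and become '?'.
    print(encode_form({"q": "café 日本"}))
```

Note that the substituted '?' is itself percent-encoded as %3F, so the
server cannot tell it apart from a question mark the user actually typed.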

If the form itself is written in ISO-8859-1 or another
traditional charset rather than UTF-* or another Unicode-based
charset, and if accept-charset is absent or does not
include UTF-8, the web app is probably not prepared to handle
those characters.  That is, even if we come up with a creative
way of transmitting these out-of-charset characters, that would not
solve the real problem: the web app doesn't handle out-of-charset
characters.  In other words, I would expect fully internationalized
web apps to use UTF-8 for the form (or to declare that they accept UTF-8
using accept-charset and use POST instead of GET), and to
interpret the charset parameter in the Content-Type header.
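
As a rough illustration of the receiving side, a server-side handler
could read the charset parameter from the Content-Type header of a POST
request and decode the body accordingly. This is only a sketch under
assumptions: decode_form and the ISO-8859-1 fallback are hypothetical
choices of mine, and it presumes the client actually sends a charset
parameter, which cannot be relied upon in general.

```python
from email.message import Message
from urllib.parse import parse_qs


def decode_form(content_type, body, default_charset="iso-8859-1"):
    """Decode a urlencoded POST body using the Content-Type charset.

    Hypothetical helper: falls back to `default_charset` when the
    header carries no charset parameter.
    """
    msg = Message()
    msg["Content-Type"] = content_type
    charset = msg.get_content_charset() or default_charset
    # The body is percent-encoded ASCII; the escaped bytes are decoded
    # with the declared (or assumed) charset.
    return parse_qs(body.decode("ascii"), encoding=charset)


if __name__ == "__main__":
    print(decode_form("application/x-www-form-urlencoded; charset=UTF-8",
                      b"q=caf%C3%A9"))
    print(decode_form("application/x-www-form-urlencoded", b"q=caf%E9"))
```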

Do you have a particular use case where sending the
out-of-charset characters would be beneficial?

Regards,
-- 
T. "Kuro" Kurosaka, San Francisco, California

Received on Thursday, 11 September 2003 12:37:14 UTC