
RE: Re: Form submission when successful controls contain characters outside the submission character set

From: Paul Cotton <pcotton@microsoft.com>
Date: Thu, 11 Sep 2003 18:18:26 -0400
Message-ID: <E7AC4500EAB7A442ABA7521D188143970795404C@tor-msg-01.northamerica.corp.microsoft.com>
To: "Chris Lilley" <chris@w3.org>, <www-tag@w3.org>

TAG members researching this item might want to review the following
material:
http://ppewww.ph.gla.ac.uk/%7eflavell/charset/form-i18n.html 

/paulc

Paul Cotton, Microsoft Canada 
17 Eleanor Drive, Nepean, Ontario K2E 6A3 
Tel: (613) 225-5445 Fax: (425) 936-7329 
mailto:pcotton@microsoft.com

  

> -----Original Message-----
> From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] On Behalf
> Of Chris Lilley
> Sent: September 11, 2003 5:24 AM
> To: www-tag@w3.org
> Subject: Fwd: Re: Form submission when successful controls contain
> characters outside the submission character set
> 
> This is a forwarded message
> From: Ian Hickson <ian@hixie.ch>
> To: "kuro@sonic.net" <kuro@sonic.net>
> Date: Thursday, September 11, 2003, 10:39:15 AM
> Subject: Form submission when successful controls contain characters
> outside the submission character set
> 
> ===8<==============Original message text===============
> 
> On Wed, 10 Sep 2003, KUROSAKA Teruhiko wrote:
> >>
> >> If you have a form on a page that is ISO-8859-1, and the data that
> >> is submitted (either as GET or as POST) from that form contains
> >> characters outside the ISO-8859-1 repertoire, what should the UA do?
> >
> > Is this a question about the real behavior of the
> > popular browsers, or are you developing a browser?
> 
> This is a question asked on behalf of Opera and Mozilla, both of which
> recently ran into this issue.
> 
> 
> > Assuming the latter, the browser is not obligated to send the input
> > data in the same charset as the form itself.
> 
> It is, however, obligated to send the form submission in one of the
> character sets specified in the accept-charset attribute.
> 
> 
> > The browser can choose to send the input data in UTF-8, as Martin
> > suggested already.
> 
> Unfortunately this is not a workable solution, for three reasons:
> 
>  * If there's an accept-charset attribute, it's wrong to violate it.
>  * There's no standard way to include character set selection
>    information in a GET request (for forms with method="get").
>  * Most servers cannot handle UTF-8 when they expect ISO-8859-1.
> 
> The first two are problems from a theoretical point of view, the last
> one is a practical problem that prevents us from doing this.
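
The practical problem in the third bullet can be illustrated with a minimal Python sketch. The helper name `encode_field` and the sample string are hypothetical, chosen only for illustration; the sketch assumes a form value containing one character inside the ISO-8859-1 repertoire and one outside it, and a server that decodes whatever bytes it receives as ISO-8859-1.

```python
from urllib.parse import quote_from_bytes, unquote_to_bytes


def encode_field(value: str, charset: str) -> str:
    """Percent-encode a form value using the given submission charset.

    Raises UnicodeEncodeError if the value contains a character that
    the charset cannot represent.
    """
    return quote_from_bytes(value.encode(charset), safe="")


# Hypothetical user input: "é" exists in ISO-8859-1, "€" (U+20AC) does not.
text = "café €"

# Case 1: the UA honours the page charset and cannot encode the value.
try:
    encode_field(text, "iso-8859-1")
    representable = True
except UnicodeEncodeError:
    representable = False

# Case 2: the UA sends UTF-8 anyway, and a server that expects
# ISO-8859-1 decodes the same bytes with the wrong charset.
utf8_wire = encode_field(text, "utf-8")
misread = unquote_to_bytes(utf8_wire).decode("iso-8859-1")

print(representable)  # False
print(utf8_wire)      # caf%C3%A9%20%E2%82%AC
print(misread == text)  # False: the server sees mojibake
```

Nothing in the percent-encoded string says which charset produced it, which is why the receiving side in this sketch has no way to detect the mismatch.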
> 
> 
> > I don't think the use of character entities is the right solution,
> > because character entities are a syntax used in HTML/XML and the
> > data returned from the form is not itself in HTML or XML.
> 
> Agreed.
> 
> 
> Anyone have any other possible solutions? :-)
> 
> --
> Ian Hickson                                      )\._.,--....,'``.    fL
> U+1047E                                         /,   _.. \   _\  ;`._ ,.
> http://index.hixie.ch/                         `._.-(,_..'--(,_..'`-.;.'
> 
> ===8<===========End of original message text===========
> 
> Forwarding this message as evidence that the internationalization
> issues with GET for form submission are still acute and still not
> solved. PUT, with an XML body, solves them.
> 
> GET might solve them in the future, if for example the Accept charset
> specification is amended to say that servers should or must accept
> UTF-8 (and perhaps UTF-16, though only one is needed) in the same way
> that XML parsers must accept UTF-8 (and UTF-16).
> 
> Until then, there will continue to be forms that cannot correctly
> transfer the text entered by a user, if they submit the results using
> GET.
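
As a rough sketch of the future scenario described above, the following Python example assumes that both the user agent and the server have agreed on UTF-8 for GET query strings. The function names `build_query` and `read_query` and the field name `q` are hypothetical; no existing browser or server behaviour is implied.

```python
from urllib.parse import urlencode, parse_qs


def build_query(fields: dict, charset: str = "utf-8") -> str:
    """Serialize form fields into a GET query string in the given charset."""
    return urlencode(fields, encoding=charset)


def read_query(query: str, charset: str = "utf-8") -> dict:
    """Decode a query string, assuming the sender used the same charset."""
    parsed = parse_qs(query, encoding=charset, errors="strict")
    return {name: values[0] for name, values in parsed.items()}


# Hypothetical input mixing characters well outside ISO-8859-1.
fields = {"q": "東京 café"}

query = build_query(fields)      # pure ASCII on the wire
received = read_query(query)     # server side, same charset assumption

print(query.isascii())     # True
print(received == fields)  # True: the text survives the round trip
```

The round trip only works because both ends share the UTF-8 assumption out of band; the query string itself still carries no charset label.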
> 
> --
> Best regards,
>  Chris                            mailto:chris@w3.org
Received on Thursday, 11 September 2003 18:19:13 GMT
