Re: Forms and CharSets

At 00:21 99/09/22 -0700, Yung-Fong Tang wrote:
> 
> 
> "Martin J. Duerst" wrote:
> 
> > At 14:42 99/09/15 -0700, Yung-Fong Tang wrote:
> > >>>>
> >      The name "accept-charset" itself is very misleading.
> > <<<<
> >
> > The name is not misleading.
> 
> If the name is not misleading, then people won't think it
> indicates which characters the client can accept, right? The
> original question itself proves this name is misleading. Does
> this mean:
> 1. What the server could accept, or
> 2. What the input fields displayed by the client could accept?
> Using this name in HTTP is not misleading. Using this name in HTML
> is misleading.

Well, then everything in HTML is misleading. You have to know
what a <p> is, or don't you?


> > <<<<
> >
> > The idea is not that the form tells the server what kind of charsets
> > the server has to accept, but that the form tells the browser what
> > kind of charsets the server actually accepts.
> 
> But this is not reliable, right? How can HTML specify the HTTP
> server's limitations/features in a reliable way? An HTTP header could
> provide reliable information about what the server could accept. It is
> wrong for an HTML document to indicate what the server could accept.

Why? As I said:

> > While the form may come from a different server than the server
> > where the CGI script is located, the author of the form has to
> > have some knowledge about what the CGI script can handle, and
> > assuming that it can know about what charsets the CGI script
> > can handle is a reasonable extension.

Of course this can go wrong, if the server is changed to accept
different charsets. But then, anything with forms can go wrong,
because the server may change the query fields it accepts, or
change what it thinks they mean.


> And the funniest
> part is that the HTML 4.0 editor simply copied the text
> from the HTTP 1.1 spec without modifying it correctly. The original
> text simply describes the Accept-Charset which the client sends out.
> How can an HTML spec specify something like "The server must...." ?

Agreed already.

> > The text above is indeed a bit unclear, and probably should be fixed.
> 
> I agree it should be fixed.


By the way, this was clearer in RFC 2070,
but seems to have somehow got mangled. From RFC 2070
[http://ietf.org/rfc/rfc2070.txt]:

   To ensure full interoperability, it is necessary for the user agent
   (and the user) to have an indication of the character encoding(s)
   that the server providing a form will be able to handle upon
   submission of the filled-in form.  Such an indication is provided by
   the ACCEPT-CHARSET attribute of the INPUT and TEXTAREA elements,
   modeled on the HTTP Accept-Charset header (see [HTTP-1.1]), which
   contains a space and/or comma delimited list of character sets
   acceptable to the server.  A user agent may want to somehow advise
   the user of the contents of this attribute, or to restrict his
   possibility to enter characters outside the repertoires of the listed
   character sets.

[please note that there is one intentional change between RFC 2070
and HTML 4.0; the Accept-Charset attribute is on <form> for the latter].
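To make the RFC 2070 text above a bit more concrete, here is a rough
sketch of what a user agent might do with the attribute: split the
space and/or comma delimited list, then pick the first listed charset
that can represent what the user typed. The function names
(parse_accept_charset, first_usable_charset) are hypothetical, and
picking the first match is only an assumption for illustration;
neither RFC 2070 nor HTML 4.0 prescribes this procedure.

```python
import re


def parse_accept_charset(value):
    """Split an accept-charset value on spaces and/or commas."""
    return [name for name in re.split(r"[\s,]+", value.strip()) if name]


def first_usable_charset(text, accept_charset):
    """Return the first listed charset that can encode text, else None.

    Assumption: the charset names are ones Python's codec registry
    recognizes; names it does not know are skipped.
    """
    for name in parse_accept_charset(accept_charset):
        try:
            text.encode(name)
        except (UnicodeEncodeError, LookupError):
            # Either the text is outside this charset's repertoire,
            # or this Python installation does not know the charset.
            continue
        return name
    return None
```

A result of None would correspond to the case where the user agent
may want to warn the user, or restrict the input, because the text
falls outside the repertoires of all listed character sets.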


Regards,   Martin.


#-#-#  Martin J. Du"rst, World Wide Web Consortium
#-#-#  mailto:duerst@w3.org   http://www.w3.org

Received on Wednesday, 22 September 1999 04:01:56 UTC