Re: draft-holtman-http-safe-00.txt from Roy T. Fielding on 1996-10-11 (ietf-http-wg@w3.org from October to December 1996)

From: Roy T. Fielding <fielding@liege.ICS.UCI.EDU>
Date: Fri, 11 Oct 1996 13:50:57 -0700
To: Gavin Nicol <gtn@ebt.com>
Cc: http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <9610111350.aa10673@paris.ics.uci.edu>

Well, for the sake of implementers I'll explain my reasoning on ...

>>This is not true.  The only time that web internationalization might
>>impact the choice of POST vs GET is when it is not known what the input
>>character set will be *and* it is possible for the user agent to submit
>>data in some character set other than what is expected *and* the form
>>does not contain an entry box for the user to select which particular
>>character set they are using *and* nobody has convinced the browser
>>community to include a standard hidden form field containing the charset
>>whenever the charset is not the same as that of the output form.
> 
> Incorrect. I18N needs message bodies if any one of the following is true.
> 
> 1) The coded character set or encodiung of transmitted data is
>    known and it is anything other than ISO 8859-1.

Nope.  Regardless of whether the charset of transmitted data is known or
unknown, what gets stuck in the GET URL query part is the sequence of
8-bit bytes encoded as if the bytes were ISO 8859-1 -- the server is
under no obligation, when decoding the URL query part, to treat those
bytes as anything other than binary data after removing the URL encoding.
It's messy, but that's what happens.  If the server knows what the charset
is intended to be, it can interpret the data in that charset (even if it
varies between form elements).

> 2) The server indicates that it can accept multiple coded character
>    sets and encodings, and one from the list is being used for the
>    transmitted data.

And the server provides no other means for the client to indicate what
the charset is.  One other such means is a FORM selection field above
the entry box.  Another, not currently available, would be an agreement
with user agent vendors to include the hidden field xxxx_CHARSET with
any input field xxxx for which the charset used is different than the
charset of the document containing the FORM.

There are many ways to solve the i18n problems using the existing
technology.  If that wasn't true, the solution would have been in HTTP/1.1.
It is only when we want the "best" or "cleanest" mechanism that we end up
running into barriers, but at least we are making progress on removing
those as well.

GET+body is useful, beyond i18n, because we also want to do things like
submit complex search queries using portable scripting languages, where
the language is indicated by the Content-Type and the request applies
to the entire server or a server subspace.

 ...Roy T. Fielding
    Department of Information & Computer Science    (fielding@ics.uci.edu)
    University of California, Irvine, CA 92697-3425    fax:+1(714)824-4056
    http://www.ics.uci.edu/~fielding/

Received on Friday, 11 October 1996 14:10:50 UTC