Re: charset issues

On Fri, 20 Dec 1996 Ed_Batutis/CAM/Lotus@crd.lotus.com wrote:

> In regard to character sets I'd like to see the following happen:

Good overview.

> In regard to POSTed data, there are good solutions and browser/server
> vendors need to agree on one (or more) as soon as possible.
> 
> Lacking any new mechanisms, servers should assume the form data is in the
> same encoding as the form document. Maybe this is the state-of-the-art
> anyway. In any case, this is obviously inefficient because if the server
> serves documents with different encodings it places an undue burden on the
> server. It would be better to tag the return data so that the server does
> not need to look at the original document.
> 
> I've read about or imagined several ways this can be done:
> 
> 1) Use the mechanism proposed by Larry Masinter in multipart/form-data.
> 
> 2) Use "application/x-www-form-urlencoded ; charset=<one of the well-known
> encodings>"
> 
> 3) Create a new Media Type that is identical to
> application/x-www-form-urlencoded but allows a charset parameter
> 
> 4) Use the "charset field" approach (a charset is sent back as the value
> for a special field that is hidden from the user)
> 
> 5) Send an Accept-Charset on the POST

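Either you mean sending Accept-Charset in the FORM, or you have forgotten
something: as an HTTP request header, Accept-Charset only says which
charsets the client accepts in the response, not how the POSTed data
is encoded.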

The I18N HTML spec proposes an Accept-Charset attribute for the
textual input fields of forms.

The main use case is a server that sends out the document in a
well-known encoding (for efficiency reasons), but expects the FORM
answer to come back as UTF-8 (because a single encoding is easier
to deal with).
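
As a rough sketch of that case (not taken from the spec): the page
below is hypothetical, as are the field name "city" and the helper
decode_form. It assumes the client honours the attribute and submits
UTF-8, so the server needs only one decoder no matter how the
document itself was encoded.

```python
from urllib.parse import parse_qs, urlencode

# Hypothetical page: served as ISO-8859-1, but the text field asks
# for its value to come back as UTF-8.
PAGE = (
    '<form method="post" action="/submit">'
    '<input type="text" name="city" accept-charset="UTF-8">'
    '</form>'
).encode("iso-8859-1")


def decode_form(body: str) -> dict:
    """Decode an application/x-www-form-urlencoded body, always as UTF-8.

    errors="strict" makes a submission in some other encoding fail
    loudly instead of being silently garbled.
    """
    return parse_qs(body, encoding="utf-8", errors="strict")


# What a client honouring the attribute would send for "Zürich".
body = urlencode({"city": "Zürich"}, encoding="utf-8")
fields = decode_form(body)
```

The point of the design is that decode_form never has to look up
which document the form came from.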

In the long run, UTF-8 is the encoding of choice for (short) form
submissions.
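
Until then, a server has to work out the encoding per submission.
The following is only a sketch of how options 2 and 4 from the quoted
list could be combined with the "same encoding as the form document"
fallback. The hidden field name "_charset_" and the function
resolve_charset are assumptions made for illustration, not something
the proposals above define.

```python
from email.message import Message
from urllib.parse import parse_qs

CHARSET_FIELD = "_charset_"  # hypothetical name for the hidden field


def resolve_charset(content_type: str, body: str, document_charset: str) -> str:
    """Pick the charset to use for decoding a urlencoded form body."""
    # Option 2: a charset parameter on the media type.
    msg = Message()
    msg["Content-Type"] = content_type
    from_header = msg.get_content_charset()
    if from_header:
        return from_header

    # Option 4: a hidden field carrying the charset. Charset names are
    # ASCII, so a latin-1 pass is enough to read that one field safely.
    probe = parse_qs(body, encoding="latin-1")
    if CHARSET_FIELD in probe:
        return probe[CHARSET_FIELD][0].lower()

    # Fallback: assume the encoding of the document that held the form.
    return document_charset


def decode_form(content_type: str, body: str, document_charset: str) -> dict:
    charset = resolve_charset(content_type, body, document_charset)
    return parse_qs(body, encoding=charset)
```

Only the last branch forces the server to remember how the original
document was encoded, which is the burden the tagging proposals are
meant to remove.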

Regards,    Martin.

Received on Friday, 20 December 1996 16:11:34 UTC