Re: character sets in HTTP: translation tables

At 12:26 PM 12/28/94, Larry Masinter wrote:
>> 3. Is Unicode the answer?
>
>Unicode is one of the answers. It is perfectly possible to do
>multi-lingual HTML without Unicode, for those who would prefer to do
>so. Unicode may be the answer for you and for me, but it's become
>abundantly clear that it isn't the answer for some folks, and we don't
>actually need to force the issue this way.

I certainly agree that Unicode has to win on its own merits, rather than
being stuffed down people's throats, but I think one of those merits is its
ability to act as a character set "hub" for translation. In that way, it
can be used for HTTP without either the content on the server, or the
display end on the client, having to know anything about Unicode. This use
of Unicode is transparent to end users and suppliers of content, and so is
really an internal implementation detail of the client/server protocol.
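
To make the "hub" idea concrete, here is a minimal sketch, not a proposal
for any actual implementation. It assumes Python's built-in codecs as
stand-ins for the per-charset translation tables; the function name
transcode_via_hub and the choice of Shift_JIS and EUC-JP are purely
illustrative. The point is that each charset needs only a table to and
from Unicode, rather than one table per pair of charsets.

```python
# Illustrative sketch only: Python's built-in codecs stand in for the
# per-charset translation tables. transcode_via_hub is a hypothetical name.

def transcode_via_hub(data: bytes, source_charset: str, target_charset: str) -> bytes:
    """Convert bytes between two charsets, using Unicode as the intermediate form."""
    # Step 1: source charset -> Unicode (a Python str holds Unicode code points).
    text = data.decode(source_charset)
    # Step 2: Unicode -> target charset. Characters the target cannot
    # represent are replaced with '?' instead of raising an error.
    return text.encode(target_charset, errors="replace")


# Content stored on the server in Shift_JIS, displayed by a client that
# only understands EUC-JP. Neither end handles Unicode directly.
server_bytes = "日本語".encode("shift_jis")
client_bytes = transcode_via_hub(server_bytes, "shift_jis", "euc_jp")
print(client_bytes.decode("euc_jp"))
```

Neither the stored content nor the displayed result is Unicode here; it
appears only in the middle of the conversion.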

>
>As for the HTTP protocol element, I think we might be better off with
>
>   accept-parameter: charset=unicode-1-1-utf7
>
>than
>
>   accept-charset: unicode-1-1-utf7
>

UTF-7 doesn't strike me as appropriate for HTTP, since HTTP is a
guaranteed 8-bit protocol. UTF-8 or UCS-2 would seem a better fit. I
realize people are just using UTF-7 as an example for now, but I thought
I'd point this out.
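
A quick way to see the difference between the encodings is the rough
sketch below. It assumes Python's standard codecs, and uses "utf-16-be"
as a stand-in for UCS-2 (the two coincide for characters in the Basic
Multilingual Plane). The helper name encoding_profile and the sample
string are my own, for illustration only.

```python
# Illustrative sketch: compare how UTF-7, UTF-8, and a UCS-2 stand-in
# encode the same text. encoding_profile is a hypothetical helper.

def encoding_profile(text: str, codec: str) -> tuple[int, bool]:
    """Return (encoded length in bytes, True if every byte fits in 7 bits)."""
    encoded = text.encode(codec)
    seven_bit_only = all(byte < 0x80 for byte in encoded)
    return len(encoded), seven_bit_only


sample = "Grüße"  # arbitrary sample text with two non-ASCII characters

# "utf-16-be" stands in for UCS-2; this is an assumption that holds for
# BMP characters only.
for codec in ("utf-7", "utf-8", "utf-16-be"):
    length, seven_bit_only = encoding_profile(sample, codec)
    print(f"{codec}: {length} bytes, 7-bit only: {seven_bit_only}")
```

UTF-7 restricts itself to 7-bit bytes so that it can survive mail
transport, a constraint an 8-bit channel does not impose.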

>
>================================================================
>As a final note, re:
>
>>  In addition, the MIME specification states that for the text/* data
>>  types, all line breaks must be indicated by a CRLF pair. This implies
>>  that certain encodings cannot be used within the text/* data types if
>>  the WWW is to be strictly MIME conformant.
>
>The MIME draft standard makes no such claims. There is a document
>being circulated by the mail extensions working group which makes
>stronger claims about text/* data types, but that document is not yet
>even a proposed standard.

This actually comes from an Internet draft, not a document circulating in
the mail extensions working group, and that draft is the next revision of
the MIME spec. So while what you say is literally true, unless the draft
is changed, this is what the next edition of the MIME spec will say.

David Goldsmith
Senior Scientist
Taligent, Inc.
10201 N. DeAnza Blvd.
Cupertino, CA 95014-2233
david_goldsmith@taligent.com

Received on Wednesday, 28 December 1994 17:18:06 UTC