Re: [www-mling,00154] charset parameter (long) from Gavin Nicol on 1995-01-10 (ietf-http-wg@w3.org from January to March 1995)

From: Gavin Nicol <gtn@ebt.com>
Date: Tue, 10 Jan 1995 08:08:18 -0500
To: bobj@mcom.com
Cc: www-mling@square.ntt.jp, html-wg@oclc.org, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com
Message-Id: <199501101308.IAA06632@ebt-inc.ebt.com>

I too agree with Dan: if the data uses anything other than Latin1, it
should be tagged as such. We need to force people to start using
tags, one way or another, and this seems a reasonable line to draw.
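
To make the rule concrete, here is a rough sketch in Python of how a
client might apply it: an untagged body is read as Latin1, and anything
else must carry a charset parameter. The function names
(effective_charset, decode_body) are hypothetical and invented for this
illustration; this is not how any existing client is known to work.

```python
from email.message import Message

# Assumption for this sketch: untagged data is treated as Latin1.
DEFAULT_CHARSET = "iso-8859-1"


def effective_charset(content_type: str) -> str:
    """Return the charset named in a Content-Type value, or Latin1 if untagged."""
    msg = Message()
    msg["Content-Type"] = content_type
    # get_content_charset() returns the lowercased charset parameter,
    # or the fallback passed in when no charset parameter is present.
    return msg.get_content_charset(DEFAULT_CHARSET)


def decode_body(body: bytes, content_type: str) -> str:
    """Decode a response body according to its (possibly missing) charset tag."""
    return body.decode(effective_charset(content_type))
```

With this sketch, decode_body(data, "text/html") reads data as Latin1,
while decode_body(data, "text/html; charset=ISO-2022-JP") uses the
tagged encoding.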

I would also like to note (and I am sure everyone agrees) that this is
all a temporary solution. Things like:

>A web site with versions of the same files in different encodings
>(e.g., SJIS, EUC and JIS) or languages (e.g., English and Japanese) could
>create separately rooted trees with the equivalent files in each tree.
>The top page could say click here for SJIS/EUC/JIS or English/Japanese.

are obviously a poor solution at best. Larry's "conversion server"
idea is clearly preferable.

>Ken>I want to add one more thing about this issue. We could have the document
>Ken>which uses multiple charset in future. We must define the way to label
>Ken>such a document.
>Ken>It can be like ...
>Ken>        Content-Type: text/html; charset="ISO2022-JP", charset="ISO8859-6"
>Ken>Is this OK?

As Dan noted, you cannot do it this way. SGML (even if you play with
the declaration) *cannot* handle it. The <charset> tag idea is also not a
winner...

Forcing charset= usage will break some clients (perhaps all), but
given the distribution channels in Japan where everyone gets such
clients (magazines, Nifty, NTT), updating should not be hard. It will
certainly be easier than supporting broken software for years to
come....

Received on Tuesday, 10 January 1995 05:33:50 UTC