Re: http charset labelling

Masataka Ohta (mohta@necom830.cc.titech.ac.jp)
Mon, 19 Feb 96 11:28:05 JST


From: Masataka Ohta <mohta@necom830.cc.titech.ac.jp>
Message-Id: <199602190228.LAA05387@necom830.cc.titech.ac.jp>
Subject: Re: http charset labelling
To: gtn@ebt.com (Gavin Nicol)
Date: Mon, 19 Feb 96 11:28:05 JST
Cc: keld@dkuug.dk, dupuy@cs.columbia.edu, uri@bunyip.com
In-Reply-To: <199602190101.UAA09669@ebt-inc.ebt.com>; from "Gavin Nicol" at Feb 18, 96 8:01 pm

Gavin;

> The actual indication of the encoding should be hidden from the user,
> but it is still important for it to be there,

Could you please remember that, because of duplicated encoding,
the character code itself is necessary?

I have already said so more than three times.

> because even for ASCII
> names, maybe the user entered it in zenkaku.

Another good example of duplicated encoding. But you misunderstand how
Japanese encoding works.

With the RFC 1468 ISO-2022-JP encoding, the character 'A' may be
represented with ASCII, JIS X 0201 or JIS X 0208.

But there is no such thing as "zenkaku" in the encoding. It is a display
property and has nothing to do with encoding (though brain-deadly broken
Unicode OPTIONALLY allows some display properties to be encoded, which
properly belong in HTML).

The character 'A', Latin capital letter 'A', in all of ASCII, JIS X 0201
and JIS X 0208, has, without any ambiguity, the same name: "LATIN
CAPITAL LETTER A".

That is, as a URL is ASCII only, even if a JIS X 0201 or JIS X 0208 'A'
is entered through the ISO-2022-JP encoding, the ASCII code for "LATIN
CAPITAL LETTER A" must be sent.
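
As a rough illustration of the point above, here is a minimal sketch in
Python. It is not something this message proposes: it assumes Python's
built-in "iso2022_jp" codec and uses Unicode NFKC folding as a stand-in
for "map to the ASCII character of the same name". The helper name
to_ascii_for_url is hypothetical.

```python
import unicodedata

ESC = b"\x1b"

# Three ISO-2022-JP byte sequences that all denote LATIN CAPITAL LETTER A.
A_ASCII = b"A"                                  # plain ASCII
A_JISX0201 = ESC + b"(J" + b"A" + ESC + b"(B"   # JIS X 0201 Roman, then back to ASCII
A_JISX0208 = ESC + b"$B" + b"#A" + ESC + b"(B"  # JIS X 0208 row 3, then back to ASCII


def to_ascii_for_url(raw: bytes) -> str:
    """Hypothetical helper: decode ISO-2022-JP input and fold it to ASCII.

    Assumption: NFKC compatibility folding is an acceptable way to map the
    JIS X 0208 letter onto its ASCII counterpart.
    """
    text = raw.decode("iso2022_jp")
    folded = unicodedata.normalize("NFKC", text)
    folded.encode("ascii")  # raises UnicodeEncodeError if anything non-ASCII remains
    return folded


for raw in (A_ASCII, A_JISX0201, A_JISX0208):
    print(raw, "->", to_ascii_for_url(raw))
```

Whatever the input route, the same single ASCII octet ends up in the URL.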

Finally, with the ISO-2022-JP encoding, you can insert many escape
sequences, both necessary and redundant. The sequences are totally
invisible. So, how can you figure out what the correct number of escape
sequences is?
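
To make the escape-sequence problem concrete, the following sketch
(again assuming Python's built-in "iso2022_jp" codec, which is my choice
of tool and not part of the original argument) builds several byte
strings that differ only in redundant escape sequences and checks that
they decode to the same text. The names variants and escape_count are
illustrative only.

```python
ESC = b"\x1b"
TO_ASCII = ESC + b"(B"      # designate ASCII
TO_JISX0201 = ESC + b"(J"   # designate JIS X 0201 Roman

# The same visible text, "A", with differing numbers of escape sequences.
variants = [
    b"A",                               # no escape sequence at all
    TO_ASCII + b"A",                    # one redundant switch to ASCII
    TO_ASCII + TO_ASCII + b"A",         # two redundant switches
    TO_JISX0201 + TO_ASCII + b"A",      # switch away and straight back
]


def escape_count(raw: bytes) -> int:
    """Count ESC octets, i.e. the number of escape sequences in raw."""
    return raw.count(ESC)


decoded = {raw.decode("iso2022_jp") for raw in variants}
counts = [escape_count(raw) for raw in variants]

print(decoded)  # a single distinct string
print(counts)   # yet the byte strings carry different numbers of escapes
```

The decoded text carries no trace of how many escape sequences the
original bytes contained, so a byte-for-byte comparison of such names
cannot rely on what the user sees.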

						Masataka Ohta