Re: http charset labelling from Masataka Ohta on 1996-02-19 (uri@w3.org from February 1996)

From: Masataka Ohta <mohta@necom830.cc.titech.ac.jp>
Date: Mon, 19 Feb 96 11:28:05 JST
To: gtn@ebt.com (Gavin Nicol)
Cc: keld@dkuug.dk, dupuy@cs.columbia.edu, uri@bunyip.com
Message-Id: <199602190228.LAA05387@necom830.cc.titech.ac.jp>

Gavin;

> The actual indication of the encoding should be hidden from the user,
> but it is still important for it to be there,

Could you please remember that, because of duplicated encoding,
the character code itself is necessary?

I have already said so more than three times.

> because even for ASCII
> names, maybe the user entered it in zenkaku.

Another good example of duplicated encoding. But you misunderstand how
Japanese encoding works.

With the ISO-2022-JP encoding of RFC 1468, the character 'A' may be
represented in ASCII, JIS X 0201 or JIS X 0208.

But there is nothing like "zenkaku". It is a display property and
has nothing to do with encoding (though brain-deadly broken Unicode
OPTIONALLY allows encoding some display properties, which properly
belong to HTML).

The character 'A', Latin capital letter 'A', in all of ASCII, JIS X 0201
and JIS X 0208, without any ambiguity, has the same name: "LATIN
CAPITAL LETTER A".

That is, as a URL is ASCII only, even if a JIS X 0201 or JIS X 0208 'A'
is entered through ISO-2022-JP encoding, the ASCII code for "LATIN
CAPITAL LETTER A" must be sent.
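
A minimal sketch of that rule in Python, offered as an illustration only.
It assumes Python's built-in "iso2022_jp" codec, which happens to decode
the JIS X 0208 'A' to the Unicode fullwidth form, and it assumes NFKC
normalization is an acceptable way to fold that form back to ASCII. The
names ASCII_A, JISX0201_A, JISX0208_A and to_url_char are hypothetical.

```python
import unicodedata

# Three ISO-2022-JP byte sequences for the letter A
# (escape sequences as listed in RFC 1468).
ASCII_A = b"A"                  # ASCII is the initial state, no escape needed
JISX0201_A = b"\x1b(JA\x1b(B"   # ESC ( J selects JIS X 0201 Roman
JISX0208_A = b"\x1b$B#A\x1b(B"  # ESC $ B selects JIS X 0208; 0x23 0x41 is 'A'


def to_url_char(raw: bytes) -> str:
    """Decode ISO-2022-JP input and fold it to the form a URL would carry.

    Assumption: NFKC stands in for "same character name, so same ASCII
    code"; a real implementation might use an explicit mapping table.
    """
    text = raw.decode("iso2022_jp")
    return unicodedata.normalize("NFKC", text)


if __name__ == "__main__":
    for raw in (ASCII_A, JISX0201_A, JISX0208_A):
        print(raw, "->", to_url_char(raw))
```

Whichever of the three encodings was typed, the sketch hands back the
same single ASCII character.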

Finally, with ISO-2022-JP encoding, you can insert any number of escape
sequences, necessary or redundant. The sequences are totally invisible.
So, how can you figure out what the correct number of escape sequences
is?
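
To illustrate the point about invisible escape sequences, here is a small
Python sketch, again assuming the built-in "iso2022_jp" codec. The list
VARIANTS is a hypothetical set of inputs chosen for the example; all of
them are distinct byte strings, yet they decode to the same text.

```python
# Distinct byte strings that differ only in redundant escape sequences.
VARIANTS = [
    b"A",                  # no escape at all (ASCII is the initial state)
    b"\x1b(BA",            # redundant "select ASCII"
    b"\x1b(B\x1b(BA",      # the same escape twice
    b"\x1b$B\x1b(BA",      # select JIS X 0208, then ASCII, with nothing between
    b"\x1b(J\x1b(BA",      # select JIS X 0201 Roman, then ASCII
]

# Decoding throws the escape sequences away, so only the text remains.
decoded = {v.decode("iso2022_jp") for v in VARIANTS}

# Re-encoding gives one form; the original escapes cannot be recovered.
reencoded = "A".encode("iso2022_jp")

if __name__ == "__main__":
    print(len(set(VARIANTS)), "byte strings decode to", decoded)
    print("re-encoded form:", reencoded)
```

A byte-for-byte comparison of the encoded names would treat these as five
different strings, which is why the comparison has to be made on the
characters and not on the encoded bytes.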

						Masataka Ohta

Received on Sunday, 18 February 1996 21:37:33 UTC