Re: http charset labelling

Keld Jørn Simonsen (keld@dkuug.dk)
Mon, 12 Feb 1996 21:44:08 +0100


Message-Id: <199602122044.VAA17228@dkuug.dk>
From: keld@dkuug.dk (Keld Jørn Simonsen)
Date: Mon, 12 Feb 1996 21:44:08 +0100
In-Reply-To: Masataka Ohta <mohta@necom830.cc.titech.ac.jp>
To: Masataka Ohta <mohta@necom830.cc.titech.ac.jp>, gtn@ebt.com (Gavin Nicol)
Subject: Re: http charset labelling
Cc: masinter@parc.xerox.com, uri@bunyip.com

Masataka Ohta writes:

> > I guess you, I, and a lot of other people, think that if people really
> > want to be global, they should avoid using kanji, or whatever, in
> > URLs. However, as a person at Astec said, and I agree, people *will*
> > put kanji into resource names, and they *will* expect it to work. As
> > such, I think it better to design a system that can handle *all*
> > cases, as users expect them to be handled.
> 
> Just make viewers bounce any URL with the 8th bit set or, at least,
> mask the bit. '%' notation should still be accepted.
> 
> It is also a good idea to do the same thing at the protocol
> specification level, stating that:
> 
> 	The 8th bit of a URL MUST be 0. Should a malformed URL be found,
> 	its 8th bit MAY be masked to 0. Otherwise the URL MUST
> 	be rejected.
> 
> Then, non-ASCII URLs will disappear.
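
A minimal sketch of the rule quoted above, in Python, assuming the URL
arrives as raw octets. The function name check_url_7bit and its mask
flag are made up for illustration and come from no specification.

```python
def check_url_7bit(raw: bytes, mask: bool = False) -> bytes:
    """Apply the proposed rule: every octet of a URL must have bit 8 clear.

    Hypothetical helper, not part of any standard or library.
    """
    # A URL with only 7-bit octets passes through unchanged.
    # '%' escapes are plain ASCII, so they are accepted as-is.
    if all(b < 0x80 for b in raw):
        return raw
    # The MAY clause: clear the 8th bit of every octet.
    if mask:
        return bytes(b & 0x7F for b in raw)
    # Otherwise the malformed URL is rejected.
    raise ValueError("URL contains an octet with the 8th bit set")
```

Note that masking does not reject anything: the octet 0xE5 simply
becomes 0x65 ('e'), so the viewer ends up asking for a different
resource than the one that was written.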

Well, URLs do not have a charset per se; they are abstract.
So URLs with % escapes in them may actually carry more than ASCII.
In fact the escaped octets could be anything, for instance a UTF-8 URL.
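
To illustrate, here is a small Python sketch using the standard
urllib.parse module. The escaped string is an example chosen for
illustration; the point is only that the same % escapes yield different
characters depending on the charset the reader assumes.

```python
from urllib.parse import unquote, unquote_to_bytes

escaped = "%E6%97%A5"  # example input, pure ASCII on the wire

# The escapes only describe octets; no charset is attached to them.
octets = unquote_to_bytes(escaped)

# Read as UTF-8, the three octets form one kanji character.
as_utf8 = unquote(escaped, encoding="utf-8")

# Read as ISO 8859-1, the same octets form three unrelated characters.
as_latin1 = unquote(escaped, encoding="latin-1")
```

Nothing in the URL itself says which of the two readings was intended.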

I do not care how the URLs look at the HTTP level;
they may have as many % escapes in them as needed, as long as the
URLs we write on business cards, in magazines, etc. can be
natural, that is, in every language and script of the world.
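
A rough sketch of that split, again in Python with urllib.parse. It
assumes UTF-8 as the encoding behind the escapes, which is one possible
choice and not something the URL syntax mandates; the function names
and the sample path are hypothetical.

```python
from urllib.parse import quote, unquote

def to_wire_form(natural_path: str) -> str:
    """Turn a natural-script path into 7-bit form with % escapes.

    Assumes UTF-8 for the escaped octets (an assumption of this sketch).
    """
    return quote(natural_path, safe="/", encoding="utf-8")

def to_natural_form(wire_path: str) -> str:
    """Turn the % escaped form back into what is printed on paper."""
    return unquote(wire_path, encoding="utf-8")

natural = "/日本語"               # what one would print on a business card
wire = to_wire_form(natural)      # what would travel at the HTTP level
```

The wire form contains only ASCII, so it would pass an 8th-bit check,
while the printed form stays readable in its own script.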

Keld