Re: html, http, urls and internationalisation

From: Larry Masinter <masinter@parc.xerox.com>
Date: Sun, 28 Jan 1996 20:54:38 PST
To: yergeau@alis.ca
Cc: keld@dkuug.dk, Dan.Oscarsson@malmo.trab.se, html-wg@oclc.org, http-wg%cuckoo.hpl.hp.com@hplb.hpl.hp.com, maits@dkuug.dk
Message-Id: <96Jan28.205446pst.2733@golden.parc.xerox.com>

> Perhaps this is a hint that there is a user requirement that hasn't
> been met?

I completely, 100%, without a doubt, with great fervor, absolutely
agree that the issue of internationalization in URLs as they are used
today is a serious problem. I am not disagreeing with your
characterization of the problem. It's that the solutions proposed so
far don't actually work, or solve the problem while breaking something
else. 

> Doesn't work.  If I pick a URL containing cyrillic letters from a KOI8
> document, and retype it in an ISO-8859-5 document keeping the
> *characters* constant, the octets will change and the link won't work...

There *are* no legal URLs containing cyrillic letters. There are no
cyrillic letters in the set of allowed characters of URLs in RFC 1738.

Now, you might imagine a world where URLs might contain cyrillic
letters, and then, yes, you would have a problem, because there are
multiple ways to interpret cyrillic letters as octets.
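
To make the octet mismatch concrete, here is a minimal sketch in Python. It
assumes the standard library codecs `koi8_r` and `iso8859_5` as stand-ins for
"KOI8" and "ISO-8859-5"; the helper `url_octets` and the sample path are
hypothetical names chosen for illustration, not part of any specification.

```python
from urllib.parse import quote, unquote_to_bytes


def url_octets(text: str, charset: str) -> str:
    """Convert text to octets in the given charset, then percent-encode them."""
    return quote(text.encode(charset), safe="/")


# Hypothetical path segment, typed with the same *characters* in both documents.
path = "документ"

koi8 = url_octets(path, "koi8_r")
iso = url_octets(path, "iso8859_5")

print(koi8)
print(iso)
print(koi8 == iso)  # the characters are constant, but the octets are not

# A server that stored the name as KOI8 octets and receives the
# ISO-8859-5 octets would interpret them as different characters.
misread = unquote_to_bytes(iso).decode("koi8_r")
print(misread == path)
```

Both percent-encoded strings are legal under RFC 1738, since only ASCII
characters and %XX escapes appear in them, but they name different octet
sequences, so nothing in the URL itself says which charset was meant.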

It would be lovely if we had a world-wide directory service that would
let users type in short strings that they found on billboards, heard
on the radio or read on someone's business card, and that the user's
type-in could be in the user's own language and character set.

It would be lovely if that system weren't necessarily linked to DNS
and existing protocols, too, so that material could migrate from ftp
to gopher to http without having to change all of the pointers.

It might also solve the problems that are exemplified by
'pimples.com' and 'toystory.com': the .com DNS name space filling up
with product names, trademarks, and short-lived material that has no
particular relationship to the original intent of DNS.

I think this might even be a tractable problem, but solving it doesn't
mean taking an axe to URLs, HTTP, or HTML; it means creating a
different method of reference.

Unfortunately, this doesn't seem to be a problem that the URN
community is interested in solving either, so it might find a home
elsewhere.

I really dislike these cross-posted conversations on HTTP-WG and HTML-WG
that are about topics that are out of scope for either working group.

> Anyway, the URI-WG is dead.

The URI mailing list is still open, uri@bunyip.com, and I think that's
as reasonable a place to follow up on this as any.

Please, no more here (HTTP-WG), or here (HTML-WG).
 
Received on Sunday, 28 January 1996 20:56:21 EST