Re: html, http, urls and internationalisation from Larry Masinter on 1996-01-31 (uri@w3.org from January 1996)

From: Larry Masinter <masinter@parc.xerox.com>
Date: Wed, 31 Jan 1996 00:43:53 PST
To: borka@e5.ijs.si
Cc: keld@dkuug.dk, yergeau@alis.ca, Dan.Oscarsson@malmo.trab.se, maits@dkuug.dk, uri@bunyip.com
Message-Id: <96Jan31.004354pst.2733@golden.parc.xerox.com>

> What Keld said is sound and could be worked further. The major
> restriction is the DNS part and this should be kept as it is
> (character < 127). The same applies to the syntax characters.

No, "what Keld said" isn't "sound"; it just "sounds nice".

Keld said, for example,

> 1. URLs themselves.

> These are at an abstract character level, as Larry and Franc,ois
> correctly points out, you cannot see what is the charset
> when you look at a business card or an URL in the newspaper.

> I propose that any character here be allowed, except for the 
> URL syntax characters, (things like < / : ) - in the non-DNS
> part of the URL. Remember these are abstract characters, and
> there is no binding to for example ISO 10646 in the sense
> of a character repertoire, or to any encoding (charset).

However, this nice-sounding proposal contained no solution to the
following questions:

1) how do these abstract characters subsequently get turned
  into octets that are employed in real protocols in general
  and http and ftp in particular?
  (The current URL specification gives an algorithm.)

2) how does one translate a URL that uses a large character
  repertoire so that it might be written in a context with
  a small repertoire? E.g., a URL with Chinese characters
  in an ASCII email message.
  (The current URL specification manages this by limiting
  the repertoire.)
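
To make the two questions concrete, here is a minimal sketch in Python.
The helper name escape_octets and the sample path are hypothetical, and
the two charsets are picked only for illustration. It applies the
%HH escaping of the current URL scheme to the octets of a non-ASCII
path, and shows that the result depends entirely on which charset was
assumed when turning characters into octets:

```python
from urllib.parse import quote, unquote_to_bytes


def escape_octets(text: str, charset: str) -> str:
    """Encode text to octets in the given charset, then %HH-escape them.

    The charset argument is the assumption that question 1 leaves open:
    the abstract characters alone do not determine the octets.
    """
    octets = text.encode(charset)
    # quote() accepts bytes and escapes every octet outside the safe set.
    return quote(octets, safe="/")


path = "/caf\u00e9"  # hypothetical path ending in e-acute

as_latin1 = escape_octets(path, "iso-8859-1")
as_utf8 = escape_octets(path, "utf-8")

print(as_latin1)  # /caf%E9
print(as_utf8)    # /caf%C3%A9

# The escaped form is pure ASCII, so it survives an ASCII-only context
# (question 2), but a receiver only gets the octets back, not the charset.
print(unquote_to_bytes(as_latin1))  # b'/caf\xe9'
```

The same abstract character yields two different URLs, so a server
that stored the name under one charset will not match a client that
escaped it under the other.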

I don't think these problems are unsolvable, but I think in the course
of making a "sound" proposal you'll find that it starts "sounding"
less and less like something that you'd want to implement.

So, I'll ask again, PLEASE stop cross-posting this discussion to three
separate mailing lists.

Received on Wednesday, 31 January 1996 03:44:16 UTC