Re: Globalizing URIs

Martin J Duerst (mduerst@ifi.unizh.ch)
Tue, 8 Aug 1995 19:40:16 +0200 (MET DST)


Message-Id: <9508081740.AA12230@mocha.bunyip.com>
Subject: Re: Globalizing URIs
To: FisherM@is3.indy.tce.com (Fisher Mark)
Date: Tue, 8 Aug 1995 19:40:16 +0200 (MET DST)
Cc: uri@bunyip.com
In-Reply-To: <30278C2B@MSMAIL.INDY.TCE.COM> from "Fisher Mark" at Aug 8, 95 09:01:00 am
From: Martin J Duerst <mduerst@ifi.unizh.ch>

>Mark Fisher (fisherm@indy.tce.com) wrote: 

>Maybe I missed the proposal, but how about standardizing on Unicode for 
>URLs?  At most this should require one mapping table per environment, as I 
>doubt there are many users who require 3 or more incompatible character 
>sets.

To avoid any further confusion on this issue, and as the originator
of the confusion, I would like to give some initial information here;
more will follow tomorrow as necessary (it is already pretty late here).

The subject was discussed quite extensively on html-wg (see the archives
for this list at <http://www.acl.lanl.gov/HTML_WG/archives.html>).
In response to a long posting of mine, Larry Masinter suggested
taking the discussion to this list, and also set the "Reply-To" field
so that my answer was sent directly to the list. Fortunately, I discovered
that this morning and was able to subscribe.

To give a short summary, the problem surfaced in an attempt to
write an Internet-Draft for full internationalization of HTML.
We have quite a good idea of how to manage things such as
various character sets for the contents of HTML documents,
but this does not apply to the names of these documents, or,
in other words, to their URLs.

Although a URL is, technically speaking, just a sequence of octets,
encoded with %HH where necessary, these octets more often than
not represent characters, and there are many occasions on which
it would be desirable to show the actual characters to a user.
In an international setting, this is only possible if the
character set and encoding of these characters are known.
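
To make the point concrete, here is a minimal sketch in Python. The
escaped string "Z%FCrich" and the helper name decodes_as are my own
illustrative assumptions, not taken from any actual URL or proposal.
It only shows that undoing the %HH escapes gives octets, and that
the characters shown to the user depend on which character set the
reader assumes.

```python
from urllib.parse import unquote_to_bytes

# Hypothetical path component: "Zurich" with a u-umlaut sent as octet 0xFC.
escaped = "Z%FCrich"

# Undoing the %HH escapes yields octets, not characters.
raw = unquote_to_bytes(escaped)

# The same octets read as different text under different assumed charsets.
as_latin1 = raw.decode("iso-8859-1")  # Western European reading
as_greek = raw.decode("iso-8859-7")   # Greek reading of the same octets


def decodes_as(octets: bytes, charset: str) -> bool:
    """Return True if the octets form valid text in the given charset."""
    try:
        octets.decode(charset)
        return True
    except UnicodeDecodeError:
        return False


print(raw)
print(as_latin1, as_greek)
print(decodes_as(raw, "utf-8"))
```

Nothing in the URL itself says which of these readings was intended,
which is exactly the information a receiving program would need.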

As several of the schemes for which there are URLs show
this problem and do not have a solution of their own, it
was felt that a general solution at the level of URLs might
be in order, in which case this list is indeed the
place to discuss it.

So much for today; I hope that the general idea has become
clear enough. I will give more explanations tomorrow
as necessary.

Regards,	Martin.

----
Dr.sc.  Martin J. Du"rst			    ' , . p y f g c R l / =
Institut fu"r Informatik			     a o e U i D h T n S -
der Universita"t Zu"rich			      ; q j k x b m w v z
Winterthurerstrasse  190			     (the Dvorak keyboard)
CH-8057   Zu"rich-Irchel   Tel: +41 1 257 43 16
 S w i t z e r l a n d	   Fax: +41 1 363 00 35   Email: mduerst@ifi.unizh.ch
----