Re: URL internationalization!

Martin J. Duerst (mduerst@ifi.unizh.ch)
Mon, 24 Feb 1997 18:22:21 +0100 (MET)


Date: Mon, 24 Feb 1997 18:22:21 +0100 (MET)
From: "Martin J. Duerst" <mduerst@ifi.unizh.ch>
To: "Gregory J. Woodhouse" <gjw@wnetc.com>
Cc: "Roy T. Fielding" <fielding@kiwi.ICS.UCI.EDU>,
Subject: Re: URL internationalization!
In-Reply-To: <Pine.SGI.3.95.970220153223.21842A-100000@shellx.best.com>
Message-Id: <Pine.SUN.3.95q.970224175050.245Q-100000@enoshima>

Hello Gregory,


On Thu, 20 Feb 1997, you wrote:

> I believe Roy's comments are right on the mark here. URLs are resource
> identifiers which have the added convenience of being easily transcribable
> and can be chosen to have some mnemonic value, but it is not their function
> to implement a directory service.

It's definitely not the function of URLs to implement a directory
service. A URL either hits or fails, whereas (most) directory services
give you a possibly long list of alternatives.

There are other problems when trying to solve the internationalization
problem with directories above URLs. Many of them are general problems
of not having internationalization available for URLs, such as the
problems with the query part, the problem with the display of URLs
in various places in the browser, and so on. Here are two more,
specific to directory services:

- There are a lot of different proposals for directory services,
	some of them caring about internationalization, but most
	of them not yet addressing this point. However, there is
	no one directory service that is in full use, and it's
	very difficult to know when there will be one.
- Directory services themselves will create the need for schemes
	to access them with URLs. The dog bites its own tail.

The last one in particular is a serious knockout :-).


> To use a worn out example, URLs are to
> the web as i-nodes are to the Unix filesystem. Just as directories give us
> meaningful file names under Unix (and not just i-node numbers), directory
> services on the Internet are the right way to introduce names which are
> meaningful in local character sets.

If URLs were like i-numbers and internet addresses (and port numbers),
you would see many more things such as
	21://130.60.48.8/210900
instead of
	ftp://ftp.ifi.unizh.ch/pub/multilingual/FontComposition.ps.Z
In the first form, using a port number for the scheme and an i-number
for the path/resource name is not possible (though the latter could be
simulated), and using the internet address is rarely done.
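
To make the contrast concrete, here is a minimal sketch in Python that
splits the readable URL into its parts and builds the purely numeric
form by hand. The scheme-to-port table and the function name are
assumptions made for illustration; the address and i-number are simply
the values quoted above, not looked up from anywhere.

```python
from urllib.parse import urlsplit

# Hypothetical lookup table for illustration; only the FTP entry
# (port 21) is taken from the example in this message.
WELL_KNOWN_PORTS = {"ftp": 21, "gopher": 70, "http": 80}

def numeric_form(url, address, i_number):
    """Build the 'i-node style' identifier for a readable URL.

    The address and i-number must be supplied by the caller: nothing
    in the readable URL lets us derive them without asking the host.
    """
    parts = urlsplit(url)
    port = WELL_KNOWN_PORTS[parts.scheme]  # scheme name -> port number
    return f"{port}://{address}/{i_number}"

readable = "ftp://ftp.ifi.unizh.ch/pub/multilingual/FontComposition.ps.Z"
opaque = numeric_form(readable, "130.60.48.8", 210900)
print(opaque)  # 21://130.60.48.8/210900
```

The readable form carries a scheme, a host name, and a path that a
human can remember and retype; the numeric form carries none of that.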

Now we can either say that URLs, file names, and domain names are
low-level directory services, or we can say that URLs, file names,
and domain names are not directory services, and that for example
the unix "find" command would be a (crude) directory service for
a file system, and a search engine would be a directory service
for domain names or URLs.
Whichever way we choose to solve the terminology problem, the fact remains that
URLs, file names, and domain names all are identifiers that are
frequently used by human beings. And where this is possible, many
human beings prefer to use these identifiers in their local
script. Countless filenames on countless computers all over the
world are a strong testimony. The use of URLs in native encoding in
HTML documents (not fully guaranteed, but working in many cases) is
another strong testimony. For domain names, we don't have such a
testimony, but I know that where this is possible (e.g. NT), users
like to give their machines names they can understand and that are
written in a familiar script.
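
Why native-encoding URLs are "not fully guaranteed" can be sketched in
a few lines of Python. The path below is a made-up example, and the
choice of UTF-8 and EUC-JP as the two encodings is my assumption for
illustration; the point is only that the same characters become
different bytes, and therefore different URLs on the wire, depending
on which encoding each side assumes.

```python
from urllib.parse import quote, unquote

# Hypothetical path in a local script, for illustration only.
path = "/日本語.html"

# The same characters, percent-encoded under two different encodings.
as_utf8 = quote(path, encoding="utf-8")
as_euc_jp = quote(path, encoding="euc_jp")

print(as_utf8)               # /%E6%97%A5%E6%9C%AC%E8%AA%9E.html
print(as_utf8 == as_euc_jp)  # False

# Decoding works only if both sides agree on the encoding.
agreed = unquote(as_utf8, encoding="utf-8")
mismatched = unquote(as_euc_jp, encoding="utf-8")
print(agreed == path)        # True
print(mismatched == path)    # False
```

As long as author, browser, and server happen to share one encoding,
things work; once they differ, the identifier no longer matches.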


> An advantage of directory services is
> that it is possible to have multiple directory services for the same file
> system (as a simple example think of long filenames and 8.3 filenames under
> Windows 95). The same web could easily support Directory services for
> English and Japanese.

You can have multiple filenames without having a different directory
service. Links do the job. You can of course also have multiple URLs
for one and the same resource (apart from terminology problems :-).
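
As a small illustration of links doing the job, the following Python
sketch gives one file two names inside a temporary directory. The
file names are invented for the example, and I am assuming a hard
link on a file system that supports them; a symbolic link would serve
equally well for the argument.

```python
import os
import tempfile

def link_demo():
    """Create one file, give it a second name, and compare the two."""
    with tempfile.TemporaryDirectory() as tmp:
        first = os.path.join(tmp, "report.txt")    # hypothetical name
        second = os.path.join(tmp, "bericht.txt")  # hypothetical name
        with open(first, "w") as f:
            f.write("one file, two names\n")
        os.link(first, second)  # hard link: a second directory entry
        same = os.path.samefile(first, second)  # same underlying file?
        with open(second) as f:
            content = f.read()
        return same, content

same, content = link_demo()
print(same)  # True
```

Both names live in the same directory of the same file system; no
second directory service was needed to get them.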


> It may sound a bit strange for me to say this now as I have objected in the
> past to the idea that certain URI schemes should use what are in essence
> binary strings. That URLs are transcribable and easily remembered is a
> considerable convenience, and this feature of the URL mechanism is a great
> asset. However, this should not obscure the fact that URLs are not meant to
> implement a directory service.

We don't want URLs to become a high-level directory service.
But we want URLs to provide the same convenience for users
around the world. Every US user is free to choose a URL
consisting of numbers only, but very few do so, not even those
who vehemently claim that URLs should be without meaning.
Yet people who don't use the Latin script, and to a lesser
extent people who use the Latin script with extensions,
are constantly forced to use URLs that are, for them, rather
meaningless and arbitrary.


Regards,	Martin.