Re: Globalizing URIs

Karen R. Sollins (sollins@lcs.mit.edu)
Fri, 11 Aug 1995 10:17:04 -0400


Date: Fri, 11 Aug 1995 10:17:04 -0400
Message-Id: <199508111417.KAA05540@lysithea.lcs.mit.edu>
From: "Karen R. Sollins" <sollins@lcs.mit.edu>
To: moore@cs.utk.edu
Cc: mduerst@ifi.unizh.ch, moore@cs.utk.edu, FisherM@is3.indy.tce.com,
In-Reply-To: <199508102157.RAA10879@wilma.cs.utk.edu> (message from Keith Moore on Thu, 10 Aug 1995 17:57:15 -0400)
Subject: Re: Globalizing URIs

Keith,

I agree with you.  What's more, "this group" (not in the formal sense
any more) discussed these issues at length over many months several
years ago.  At that time we agreed that we were not going for
user-friendly names.  Use by humans was to be discouraged.  If names
are to be globally unique, long-lived, and free of semantics, so that
the semantics will not be invalidated with time, then they are not
going to be things that people would or should use, and user-friendly
character sets should not be an issue.

Personally, I would like to discourage human transcription as well,
but the group did agree that that was an important feature, so we
should pick a character set that is limited enough that it is
transcribable on any keyboard we know of or can imagine.  As was said
earlier, if that means digits only, that's fine.  For a long while
we've been using the digits plus about 20 consonants.  No vowels, to
discourage any use of "words" in any language that we knew of.
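
As a rough illustration of the restricted alphabet described above,
here is a minimal sketch.  The exact consonant set is an assumption
(the Latin letters minus a, e, i, o, u and y, giving 20), and the
names ALPHABET, make_opaque_id and is_valid_id are hypothetical, not
taken from any URN proposal.

```python
import secrets

# Assumption: the ten digits plus 20 lowercase consonants (Latin letters
# without a, e, i, o, u and y). The message only says "about 20 consonants".
ALPHABET = "0123456789" + "bcdfghjklmnpqrstvwxz"


def make_opaque_id(length=16):
    """Return a random identifier drawn only from ALPHABET."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


def is_valid_id(candidate):
    """True if candidate is non-empty and uses only ALPHABET characters."""
    return bool(candidate) and all(ch in ALPHABET for ch in candidate)
```

With no vowels available, a string such as "urn" fails is_valid_id,
while anything produced by make_opaque_id passes it.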

While I'm on this topic, you might want to think about the fact that
there are several ways to embed meaning in some sense in a string such
as a URN.  One is to say that the characters in particular orderings
have meaning in one or another human language.  Another is that the
string as a whole has structure, perhaps (but not always) defined by
what one might consider punctuation.  This allows for partitioning into
components.  Each component string may itself not have meaning in a
human language, but the structure may convey meaning.

I believe that it was our intention, as described in RFC 1737, that
URNs were not required and probably not expected to have exposed
meaning in either sense.  That would certainly not prevent the creator
of a URN from embedding semantics.  In fact, URN creators might choose
to expose the semantics they embed, but they should know that they are
exposing their users and perhaps themselves to the sorts of problems
that Keith has been describing.

I don't recommend that we repeat all the discussions about semantics,
and therefore character sets, again.  We should at least get through
one complete round of engineering of the full complement of components
needed to identify and access objects in the net.

			Karen