Re: [URN] Re: URI documents

Roy T. Fielding (fielding@kiwi.ics.uci.edu)
Sat, 03 Jan 1998 00:14:48 -0800


To: Patrik Faltstrom <paf@swip.net>
cc: harald.t.alvestrand@uninett.no, moore@cs.utk.edu, uri@bunyip.com,
Subject: Re: [URN] Re: URI documents 
In-reply-to: Your message of "Sun, 28 Dec 1997 06:52:19 +0100."
             <Pine.GSO.3.96.971228063242.27472E-100000@nix> 
Date: Sat, 03 Jan 1998 00:14:48 -0800
From: "Roy T. Fielding" <fielding@kiwi.ics.uci.edu>
Message-ID:  <9801030022.aa08054@paris.ics.uci.edu>

Patrik writes:
>
>It _might_ be the case that a URN should be parsed differently than a URL.
>It might be that a totally new UR* should be parsed even differently than
>a URN and a URL. I agree with you that a design like that might be stupid,
>but the fact is that you do have some small common syntactic rules for
>URNs and URLs, and that is how you find which one it is. A URN is simple
>to recognise as it is prepended with the "urn:" string, but a URL is
>harder because the URL scheme is syntactically written in the position
>where the URI scheme should be. Because of that, a parser must have a
>list of all known URL schemes, and if the URI scheme is one of those,
>the identifier is a URL.

I don't think you understand the impact of the URN WG's decisions.
"urn" is a URI scheme.  It is not "harder" or "easier" to interpret
than any other URI scheme; you just give it to the "urn" handler,
which is then perfectly capable of giving it to some other sub-handler
if that is how the "urn" handler is designed.  The URI syntax doesn't
care about such things, because the URI parser doesn't care whether the
identifier is a URL or URN.  Those are scheme-dependent issues, not
URI issues.

The only applications I know of that are dumb enough to use a fixed
list of known URL schemes are Navigator and plain-text scanners that
attempt to convert URLs in text to hypertext references.  Most
everything else is based on either the W3C/CERN libwww which uses a
registry of callbacks, or my own libwww-perl which uses module hooks.
This is because these architectures are designed for extensibility.
We all want this to be true, and even more prevalent in the future,
because URNs will never be deployed if they can't be used.
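
To make the registry idea concrete, here is a minimal Python sketch of
scheme-based dispatch.  It is not the libwww or libwww-perl API: the names
register_scheme, register_urn_namespace, dispatch, and the handlers are
hypothetical, and the scheme pattern is an assumed reading of the generic
URI grammar.  The point is only that the parser splits off the scheme and
looks it up, without ever asking whether the identifier is a URL or a URN.

```python
import re

# Assumed scheme grammar: a letter, then letters, digits, "+", "-" or ".".
_SCHEME_RE = re.compile(r"^([A-Za-z][A-Za-z0-9+.-]*):(.*)$", re.DOTALL)

_handlers = {}        # scheme name -> callback (hypothetical registry)
_urn_namespaces = {}  # URN namespace id -> callback (hypothetical sub-registry)


def register_scheme(scheme, callback):
    """Install a handler for a scheme; no fixed list is compiled in."""
    _handlers[scheme.lower()] = callback


def register_urn_namespace(nid, callback):
    """Install a sub-handler that the "urn" handler can delegate to."""
    _urn_namespaces[nid.lower()] = callback


def dispatch(uri):
    """Split off the scheme and hand the rest to whatever is registered."""
    m = _SCHEME_RE.match(uri)
    if m is None:
        raise ValueError("no scheme in %r" % uri)
    scheme, rest = m.group(1).lower(), m.group(2)
    try:
        handler = _handlers[scheme]
    except KeyError:
        raise LookupError("no handler registered for scheme %r" % scheme)
    return handler(rest)


def handle_http(rest):
    # Placeholder: a real handler would fetch the resource.
    return ("http", rest)


def handle_urn(rest):
    # The "urn" handler is free to pass the work on to a sub-handler.
    nid, sep, nss = rest.partition(":")
    if not sep:
        raise ValueError("URN is missing a namespace-specific part")
    try:
        sub_handler = _urn_namespaces[nid.lower()]
    except KeyError:
        raise LookupError("no sub-handler for URN namespace %r" % nid)
    return sub_handler(nss)


def handle_isbn(nss):
    # Placeholder: a real sub-handler would resolve the ISBN somehow.
    return ("urn", "isbn", nss)


register_scheme("http", handle_http)
register_scheme("urn", handle_urn)
register_urn_namespace("isbn", handle_isbn)
```

With those registrations, dispatch("urn:isbn:0451450523") goes to the "urn"
handler and from there to the "isbn" sub-handler.  Supporting a new scheme
means one more register_scheme call; the parser itself does not change.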

>Now, you simplify this by saying that syntactically, a URN can be parsed
>the same way as a URL, and one can install in the software a URN parser
>just as one installs a handler for an HTTP scheme or mailto scheme.
>
>Well, a lot of people probably do agree with you that that is the way one
>can _implement_ URLs and URNs, but that is not the way things are defined.
>What do we do when we get URZs which _don't_ have "URZ" prepended, but
>have the URZ scheme immediately in the beginning of the string, just like
>URLs? What happens if the market starts writing URNs without the string
>"urn:" in the beginning of the string, and instead only writes "isbn:" (you
>write in your document about the "side of the bus problem" regarding the
>fact that HTTP URLs lose their "http://" part)? Is "isbn" now a scheme?
>Yes, in the implementation it might be when you parse the string, but it
>is still a URN, and not a URL.

Yes, if people were not to include the "urn:" prefix, then the identifier
is no longer within the "urn" scheme.  The scheme is not optional, nor will it
ever be optional.  I believe I've said this before on the URN list.
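
In parsing terms, the scheme is simply whatever precedes the first colon,
so dropping the prefix changes the scheme.  A small Python sketch of that,
where scheme_of is a hypothetical helper and the pattern is an assumed
reading of the generic scheme grammar:

```python
import re

# Assumed scheme grammar: a letter, then letters, digits, "+", "-" or ".".
_SCHEME = re.compile(r"([A-Za-z][A-Za-z0-9+.-]*):")


def scheme_of(identifier):
    """Return the lower-cased scheme, i.e. the token before the first colon."""
    m = _SCHEME.match(identifier)
    if m is None:
        raise ValueError("no scheme present in %r" % identifier)
    return m.group(1).lower()


print(scheme_of("urn:isbn:0451450523"))  # urn
print(scheme_of("isbn:0451450523"))      # isbn
```

The second identifier is perfectly parseable, but it belongs to an "isbn"
scheme, not to "urn".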

You are talking about a philosophical problem, and I am talking about
running code.  We need a definition that corresponds to the running code,
not to the philosophical problem.

>We have today two different types of URIs: URLs and URNs. What some of us
>ask for is your document divided in three, so it is crystal clear what is
>a definition for URIs, what is for URLs and what is for URNs. I simply
>don't understand why you are opposing that so much?

Because what you are asking for is not true in current practice, nor can
it be defended by any implementations, nor is it capable of being defined
as a Draft Standard.  Aside from that, it is also poor design.  That is
why I oppose it so much: I have no desire for a useless specification
that specifies nothing more than the territorial boundary between two
IETF working groups.

....Roy