Re: report: URN Architecture Meeting at University of Tennessee, Oct 30-31 from Michael Mealling on 1995-12-01 (uri@w3.org from December 1995)

From: Michael Mealling <Michael.Mealling@oit.gatech.edu>
Date: Fri, 1 Dec 1995 10:58:10 -0500 (EST)
To: fielding@avron.ICS.UCI.EDU (Roy T. Fielding)
Cc: moore@cs.utk.edu, urn@mordred.gatech.edu, uri@bunyip.com
Message-Id: <199512011558.KAA16763@oit.gatech.edu>
Roy T. Fielding said this:
> 
> <stuff about schemes being good things> 
> 
> > on the other hand, 
> > 
> >   - if schemes tend to imply particular resolution protocols,
> >     they decrease persistence of URNs 
> 
> Yep, that would be bad, but it is also easy to avoid.  I would want any
> URN scheme to reflect a non-protocol name, like "oid:".

Yes.

<other stuff about schemes being bad but possibly good>

> >> You are assuming that there will be only one URN scheme.
> > 
> > No, I'm assuming all URNs will have a prefix that gives the 
> > client the ability to recognize it as a URN, and the minimum 
> > information necessary to use it.  (by "use it", I essentially 
> > mean "find a resolver for it").
> 
> Same thing. Honestly, that is how the Web is implemented; just look at
> any client library: my libwww-perl, CERN (now W3C) libwww, Guido's
> modules for python, etc.


It is NOT the same thing.

> You can implement URNs natively in libwww-perl simply by creating
> a perl module called www<scheme>.pl which includes a request() procedure.
> The library will load it dynamically when it encounters a URI with
> that <scheme>.  The library doesn't care whether the scheme is associated
> with a protocol or not -- it uses the scheme to select a resolution
> mechanism and thus any new scheme can be added without affecting any
> other part of the library.

That has been my WHOLE point from the very beginning for using
"URN:". I implemented a URN module in the CERN proxy a year and
a half ago. It was very simple to add into the web since I used
one scheme name to represent a CLASS of sub-schemes.
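
A rough sketch of that dispatch idea, in Python rather than Perl or C.
The names here (HANDLERS, register, resolve) and the handler bodies are
made up for illustration; they are not the libwww-perl or CERN library
interfaces, just the general shape of "pick the module by scheme name":

```python
# Hypothetical scheme -> handler table; not the libwww-perl or CERN libwww API.
HANDLERS = {}

def register(scheme):
    """Decorator that files a handler under a lowercase scheme name."""
    def wrap(func):
        HANDLERS[scheme.lower()] = func
        return func
    return wrap

@register("http")
def load_http(rest):
    # Placeholder body: a real module would speak HTTP here.
    return ("http-module", rest)

@register("urn")
def load_urn(rest):
    # One scheme name stands for a whole CLASS of sub-schemes;
    # the handler, not the dispatcher, looks further into `rest`.
    return ("urn-module", rest)

def resolve(uri):
    """Split at the first colon and hand the remainder to the scheme's handler."""
    scheme, sep, rest = uri.partition(":")
    if not sep:
        raise ValueError("no scheme in %r" % uri)
    try:
        handler = HANDLERS[scheme.lower()]
    except KeyError:
        raise LookupError("no module for scheme %r" % scheme)
    return handler(rest)
```

The dispatcher never inspects anything to the right of the first colon,
so adding "urn" costs one table entry and touches nothing else.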

In the CERN library I implemented a protocol module called HTLoadURN
which takes the URN formatted thusly: URN:namespace-id:NA:opaque_string
and resolves it based on user configurable guesses as to hosts and
protocols to use for particular name-space ids. I.e., it would have
gone to some ISBN registry server for ISBN numbers, or looked up the
NA-host for DNS-based stuff. Anyway, my point is that NOTHING in
the definition of URIs keeps a particular URI from having sub-schemes.
By putting URN: in the front of a URN we are not saying that there
is one protocol for doing it. We are just saying that urn: is one particular
URI scheme. Just like http is one particular scheme. Since URIs
shouldn't make assumptions about what's on the right hand side of the
colon we can then say that the urn: scheme has something like a  
sub-scheme but we call it a name-space identifier.
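
To make the sub-scheme idea concrete, here is a minimal Python sketch of
splitting URN:namespace-id:NA:opaque_string and picking resolver guesses
by name-space id. It is not the HTLoadURN code. The function names, the
guess table, and the hosts in it are all hypothetical, and lowercasing
the name-space id for lookup is an assumption of this sketch:

```python
from typing import NamedTuple

class URN(NamedTuple):
    namespace_id: str
    na: str
    opaque: str

def parse_urn(text):
    """Split 'URN:namespace-id:NA:opaque_string' into its parts."""
    # maxsplit=3 keeps any further colons inside the opaque string.
    parts = text.split(":", 3)
    if len(parts) != 4 or parts[0].lower() != "urn":
        raise ValueError("not of the form URN:nsid:NA:string: %r" % text)
    _, nsid, na, opaque = parts
    # Assumption: name-space ids are matched case-insensitively.
    # The opaque string is passed through untouched.
    return URN(nsid.lower(), na, opaque)

# User-configurable guesses: name-space id -> (protocol, host) pairs to try.
# Every entry is a made-up placeholder, not a real registry.
RESOLVER_GUESSES = {
    "isbn": [("http", "isbn-registry.example")],
    "dns": [("whois", "{na}")],  # ask the naming authority's own host
}

def candidates(urn, guesses=RESOLVER_GUESSES):
    """Return the (protocol, host) pairs to try for this URN, in order."""
    return [(proto, host.format(na=urn.na))
            for proto, host in guesses.get(urn.namespace_id, [])]
```

An unknown name-space id simply yields an empty candidate list, so a new
name-space only needs a new table entry, not a new URI scheme.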

This seems to make much more sense from an implementation side since it
buys you:

	the ability to have one URN library fall under one uri scheme 

	segregation from the rest of the URLs so there is no confusion
	about semantic changes between URNs and other URIs. I.e. you
	know http: acts differently than news: which acts differently
	from urn:

	you preserve much better extensibility since you can say that
	all URNs are under one URI scheme. Then you can introduce a 
	new URI later without worrying about conflicts with current urns.
	


> Extensibility is one of the most important aspects of Web technology.
> This does not necessarily mean that all vendors have succeeded in
> implementing this extensibility; it only means that the design does not
> prevent them from implementing an extensible system.  Those of us who
> know better have done better.  If the implementation of IETF-sponsored
> URN's reduces the current URI extensibility, then I will not allow them
> to become a Web standard, even if that means divorcing Web standards from
> the IETF process [which I would personally hate to do].  However, that
> has never been necessary, because every time we have polled the vendors
> on this matter they have always supported a more extensible design.

Every time YOU have polled the vendors. You should know as well as I
that who asks the question and how it's asked has a severe impact on
the actual answer.

> The problem is that the change was never made to the URN specification
> because the authors didn't follow-through, forgot, or just plain disagreed
> with the rough consensus.  

I was there. The rough consensus was NOT what you represent it to be.
The vendors by themselves don't set rough consensus. Besides, the vendors
I've talked to seem fine with either method....

> That is not following the IETF process, which
> is why I am sick of repeating myself every time a new URN draft is
> produced, and is the primary reason why so little progress has been
> made over the past three years.

Maybe it's because you were the only one who was saying what you said.
The IETF has always recognized that one person, no matter who they
represent, cannot make consensus.

> > (But I've never thought of a URN as being tightly bound to a "resource" 
> > ... it's bound to a "definition".  So a URN for today's weather map and 
> > a URN for the weather map on 11/29/95 would be different because they 
> > *mean different things*, and it doesn't matter that under certain 
> > circumstances they could refer to the same resource.  But this is 
> > independent of whether URNs have "schemes".)
> 
> It matters if the question asked is "have you seen the contents of
> this map before?", or "by which name should I refer to this resource when
> I put it in a hotlist/bookmark file?".  You are right though in that
> this example does not highlight the need for schemes.

I would hope you would put the URC in your hotlist since THAT would give
the answer to your question, not the URN.

> >> Requiring that all URNs have the same properties (i.e., case insensitive,
> >> references an entity fixed-for-all-time, etc.) would make it impossible
> >> to represent resource names as URNs.  
> > 
> > Depends on what you mean by "resource names".  I have always assumed
> > that URNs must be able to subsume other naming systems that have the
> > same basic properties -- global uniqueness, persistence, transcribability,
> > etc., but not that URNs must be able to subsume any kind of resource name 
> > (such as a URL or a file name).  Now if the other naming systems that
> > we need to subsume into URN space are really so diverse that we cannot
> > define a common "umbrella" syntax and registry and clients have to 
> > be aware of the differences in their syntax in order to "use" them...
> > well, I'm tempted to suggest that we try to solve a narrower problem.
> 
> I think that's reasonable, but the name "URN" refers to the larger problem
> of location-independent resource names.  Solving a narrower problem is fine
> provided that it does not prevent others from solving other parts of the
> problem, which means that you must have a way to differentiate between
> solutions, and thus a scheme other than "URN:" is necessary.


No. A SUB-SCHEME other than 'foo' may be necessary. If the namespace
has the properties that Keith enumerated then it can go under the
umbrella of URN:, BUT if Keith's RCDS/BFD solution doesn't match yours
you can come up with a new namespace and new protocols (which aren't bound).


> 
> > But I'm not yet convinced that we need to support this kind of
> > diversity...perhaps you could supply some firm examples?
> 
> What I am saying is that unless you can *prove* to me that we will never
> want to support that diversity, you cannot make that choice for others.

I think we do need to support this kind of diversity. We just need
to be able to identify the class under which we diversify...

> >...
> > It's not as if everyone uses the word "scheme" in the same way.
> > (sorry, couldn't resist...it's one of my favorite quotes.)
> 
> Cute, but there is only one way to use the word "scheme" when referring
> to the characters preceding a Uniform Resource Identifier.  I am not
> interested in redefining the name associated with two proposed standards
> and an installed base of >20million applications.

Netscape has more than illustrated that changing an installed base of
that magnitude (while possibly not 100% complete) can be done.



> 
> >> A scheme defines the syntax and
> >> semantics associated with the remainder of the identifier.  It does not
> >> define the resolution protocol; some identifiers have a scheme name which
> >> matches a protocol name because that is the most meaningful name to
> >> associate with a locator for which the ultimate resolution process defaults
> >> to using that protocol.  In other words, the Knoxville proposal is using
> >> the scheme "URN".

I think we have a very very bad definition problem here. You're using
web terminology ONLY. A lot of us DON'T use web terminology specifically
because (unlike others) the web is not where we do 100% of our work.

You are correct. In web terminology we want to use the scheme name "URN".
The syntax and semantics associated with the remainder of the identifier
are such that there may be one or more sub-schemes specified that
then determine the syntax and semantics of the rest of the identifier.

> > The Knoxville proposal doesn't define the syntax of the name past the NA.
> 
> Yes it does -- it defines that it is opaque and case-insensitive and
> only includes a restricted set of characters.

Did we decide case insensitive? I thought we went with case sensitive?
The character set restrictions depend on syntax, so that's up in the air.
I agree with you that there need to be as few character set
restrictions as possible.

> 
> >> World-Wide Web user agents use the identifier scheme to determine the
> >> resolution mechanism (NOT protocol -- mechanism is that *thing* which is
> >> responsible, within or outside the client, for resolving identifiers of
> >> that particular identifier type -- it may use any protocol defined by
> >> the user or vendor for resolving that scheme, including a protocol defined
> >> on-the-fly through retrieval of a script).  
> > 
> > While I agree with you in principle, this is not the case in general.
> > It's certainly possible to add a layer of indirection between a URL
> > and its servers.  But since the web wasn't designed with a standard
> > layer there from day one, it's somewhat difficult to add one now and
> > see it universally deployed.  (doesn't mean it's not a good idea --
> > it's just difficult)
> 
> Wrong.  The design has been there since day one -- in fact, it preceded
> the original definition of Universal Document Identifiers, which preceded
> the creation of the URI WG for the purpose of standardizing those identifiers.
> Schemes were designed to support extensibility of names by allowing the
> library resolver module to be determined by scheme name.  It was also in
> libwww-perl since day one.

Right. The only thing we want to do is create a new scheme name. In today's
web terminology we want to create a new URL type called "urn" that, like
the news scheme, has different possible syntaxes. In this case the
syntax is determined by a string that can come somewhere near the front
of the URL.

> 
> >> Uniform Resource Names is a category of identifiers, referring to those
> >> that identify a resource independent of its network location.  It is wrong
> >> to use "URN" as a scheme name for the same reason it is wrong to use
> >> "URL" as a scheme name.
> >> 
> >>    I CANNOT USE ANY IDENTIFIER THAT BEGINS WITH "URN:"
> > 
> > Sure you can.  You can use URN: as easily as HTTP:.
> 
> Actually, I can't use HTTP either, since schemes are required to be
> lowercase.

You didn't answer his question. What keeps you from using a new
URL scheme called 'urn'? I've already done it. It's really easy.

> > I don't really care what these things are called.  I do care about
> > not defining lots of new URI prefixes such that the client has to
> > know about each one of them individually, or so that URNs get confused
> > with URLs.  So in response to your all-caps statement, I might say:
> > 
> > 	I CANNOT USE MORE THAN ONE NEW URI PREFIX
> > 
> > although that, of course, is also false.  I do, however, think it's
> > highly undesirable to keep extending things in this way.
> 
> If you implement a "truly great" URN with a particular scheme, and
> it turns out that you are right in that your "truly great" URN is
> sufficient to solve the URN problem in general, then nobody will bother
> to use some other URN that is "less great".

You assume that this is possible. It's not. No scheme can be all things
to all people. Just ask a few librarians....

> If, however, you are wrong in that some other URN syntax is better than
> that proposed, or if some other type of URN is necessary to solve the
> bits of the URN problem which you did not consider "important enough",
> then allowing multiple URN schemes to exist will allow the proof to be
> determined by implementation and successful deployment, not by
> pre-standardization posturing.
> 
> If this is just a difference of opinion between "extensibility is bad"
> and "extensibility is good", then there is no point in continuing this
> discussion.


It's not. It's a difference of definition about what you're both calling a
scheme! To you, Roy, the scheme is "URN:"; to Keith and a lot of 
other people the scheme is "OID", "PATH", "X-DNS-2", etc.

> >> Which means, obviously, that I will forbid the use of such an identifier
> >> in any system which I design or am responsible for standardization.
> >> That is what I've said consistently for over 1.5 years now, that is what
> >> I will recommend to the W3 Consortium members, and that is the objection
> >> I will continue to raise every time this is discussed within the IETF.
> >> 
> >> Is that clear?
> > 
> > In the IETF at least, you have no authority to forbid any such thing.
> 
> I wasn't referring to IETF standards.  URN is not an IETF standard.
> URN isn't even an IETF working group.  Right now, URN isn't even out
> of the early research phase.  I do have the authority to forbid the
> use of bogus URNs in any system *I* design, and in any system in which
> *I* am responsible for standardization (e.g., the W3C use of URIs).

In that case how do you plan on stopping Microsoft and Netscape from
changing the entire web underneath you?

> To the extent that my responsibility overlaps with that of the IETF,
> I defer to the IETF.  However, the IETF's responsibility *never*
> extends to systems that are not yet implemented.  Mine does.

So if the IETF standardises something and everyone uses it you will
still object to it? That sounds like ego talking rather than
someone who works in standards bodies....


> > We make decisions by rough consensus, but the consensus of the group can 
> > override any individual.
> 
> Only if that consensus is polled for on the working group mailing list
> and the results are represented in the WG documents.  In the entire
> history of the URI WG, the only time that the "URN:" prefix *ever*
> obtained consensus was at a meeting at a bar during the Houston IETF
> meeting -- yes, that's right, it wasn't even a legitimate decision
> of those in attendance at the real meeting.


In all the subsequent discussion the only people I remember not wanting
the URN: (and URL:) were you and Dan Connolly. That's two people. 
If anyone else doesn't want it then PLEASE send mail to the list
or else you WILL BE IGNORED.

> > I personally would think it silly for us to develop this new kind
> > of identifier that we have been calling a URN all along, and use
> > any prefix for that identifier other than URN:.  But if "silly" doesn't
> > cause any implementation or operational problems, you might be able 
> > to get the group to go along with you and use some other prefix.
> > 
> > On the other hand, if we end up defining lots of new URI prefixes, 
> > we will have been wasting our time for the past 4 years or so, because
> > we will have effectively gained nothing over normal URLs.  That's 
> > not silly, that's tragic.
> 
> Since when is the existence of only one URN scheme the sole advantage
> of location-independent names?  The only thing that has been wasting our
> time for the past 4 years or so is this insistence on defining an identifier
> which is fundamentally incompatible with all existing practice.  I am trying
> to stop yet another waste of time before it starts again.  If existing
> practice will not be a concern of some future URN WG, then there should
> not be any URN WG in the IETF.


In other words the IETF should just document existing practice and just
get out of the way?


> >> Hell, ALL
> >> EXISTING IMPLEMENTATIONS OF URIs DEPEND ON THE EXISTENCE OF SCHEME NAMES.
> > 
> > This isn't a justification for anything in particular.  The reason we're
> > doing this little four-plus year exercise is that "existing implementations
> > of URIs" aren't sufficient for our needs.
> 
> NO -- URLs aren't sufficient for our needs.  There is nothing insufficient
> about the URI architecture and there is no technical reason to justify
> a change from that architecture.


WE'RE NOT CHANGING IT! 

The URI mechanism gives the developer the power to put anything he damn
well pleases after that colon. That's what we're doing.

> > (C'mon.  Does the "scheme" really have to be the part of the URI before 
> > the first colon?
> 
> Yes.

Right. The scheme is "URN", NOT the namespace-identifier.

> > Do URNs really have to share a common syntax
> > with URLs... down to including the path structure?
> 
> No, but they must be usable within the same URI structure.


URI structure said "scheme:syntax specific string", right?
That's what we're doing...

> > If you really want
> > URNs to be persistent, you don't put any semantically loaded
> > information in them at all...certainly not information that reflects
> > the internal structure of multi-file documents.)
> 
> I have seen no implementation that proves such a theory, though I have
> never suggested that all URNs must contain structural information either.
> I believe there is no harm in allowing both to coexist.

It's a general statement in Tanenbaum's Distributed OS book in the section
on naming....

> >> If you don't support the identification of resources that may already
> >> be on your local disk, identified within a personal database of resources
> >> located in a real-world bookshelf, or located within the user's local
> >> University library, then you have failed to solve the URN problem.
> >> You don't have to define these resolution mechanisms -- you just have
> >> to make them possible with minimum difficulty.
> > 
> > Actually, we do support the identification of such resources, but not
> > with names that indicate where the resources are stored.  After all,
> > a resource originally in my personal database could eventually become
> > available to the entire world...should the resource name then change and
> > then invalidate all of the references to it?
> >
> > But I could certainly configure my client to search my personal resource
> > database, my mail folders, etc.  before searching the DNS registry.
> 
> According to what constraints?  Do you want every query to search all
> available sources?  Or, do you want the sources to be ordered and targeted
> according to the likelihood of their knowledge about the resource?
> If you know a name is associated with a University Technical Report,
> don't you want your client to search the TR database before the
> library of congress?  If so, how does the client get configured for
> such preferences without looking at the opaque identifier after the
> "scheme:"?
> 
> The fact is that you cannot anticipate all the needs that I or anyone
> else may eventually have for URNs, so don't assume you have.  Provide a
> syntax that is extensible not because it will be, but because you cannot
> be sure it won't need to be.

Does this violate the Tao of the URI 

scheme:sub-scheme:sub-scheme-specific-string

????

-- 
------------------------------------------------------------------------------
Life is a game. Someone wins and someone loses. Get used to it.
Michael Mealling <http://www.gatech.edu/michael.html>
Received on Friday, 1 December 1995 10:58:39 UTC