RE: Issue: URI_URL, proposed changes from Clemm, Geoff on 2002-07-13 (w3c-dist-auth@w3.org from July to September 2002)

From: Clemm, Geoff <gclemm@rational.com>
Date: Sat, 13 Jul 2002 18:11:41 -0400
To: "'Webdav WG (E-mail)'" <w3c-dist-auth@w3c.org>
Message-ID: <3906C56A7BD1F54593344C05BD1374B107761985@SUS-MA1IT01>
   From: Julian Reschke [mailto:julian.reschke@gmx.de]

   > From:  Clemm, Geoff
   >
   > ... I (and RFC-2396) was making the opposite point, namely that
   > two different resources cannot be identified by the same URN.  So
   > if two resources have the same URN, you are guaranteed that they
   > are the same resource.

   I mis-read what you wrote, but I still disagree (now in a different
   way). A URI (no matter whether URN or URL) identifies a
   resource. So two different resources can't have the same URL
   either. That's by definition.

Sure they can.  Create a resource at /foo.html.  Create a resource at
/bar.html.  These are separate resources.  MOVE /foo.html to
/bar.html.  /bar.html now identifies a different resource.  That is
the semantics of a URN ... if a URN identifies a given resource at
time X, it also identifies that resource (or no resource at all) at
time Y.

   >    > In WebDAV, we have some URIs that we SHOULD/MUST be usable
   >    > to locate the resource.  We should call those "URLs" to emphasize
   >    > that point.  For example, the contents of the DAV:href elements
   >    > returned in DAV:response should be URLs (or more accurately,

   Actually, they MUST be URLs in the same scheme as the request URI...

That is true as well, but not relevant to the current discussion.
It is true that any http: URI is a URL.  But in general,
different URIs in a specified scheme may be URLs, URNs, both, or neither.
For the "http:" scheme, all URIs are URLs and some URIs are URNs.
For the "urn:" scheme, all URIs are URNs (or certainly should be),
and some URIs are URLs.

   >    > URL-references).  We also have some URIs which name a resource
   >    > (in particular, lock tokens).  These should be called URNs.
   >
   >    I disagree. URLs are just a subset of URIs.
   >
   > We don't disagree there.

   RFC2518 has defined that lock tokens are identified by URIs with no
   restriction.

That is incorrect.  The first line of section 6.4 (that defines a lock
token URI) states: "The opaquelocktoken URI scheme is designed to be
unique across all resources for all time."  That makes an
opaquelocktoken URI a URN, and disallows using non-URN URIs.

   An attempt to restrict lock tokens to specific URI types (are
   you trying to do that?) 

No, I have (several times :-) emphasized that whether or not a URI
is a URL or URN does not require that the URI belong to a specific
URI type.  A URI in any scheme can be a URL, if it can be used to
apply a method to the resource identified by that URI.  A URI in any
scheme can be a URN if it satisfies the semantics requirement of a URN.

   can break existing code with no benefit -- the only thing a client
   should do with a lock token is to remember it, and to resubmit when
   needed -- from it's point of view it just a string, and clients
   MUST not make any assumptions about the type of the URI scheme in
   use.

And I am not suggesting that they make any such assumption.  But they
can (and should) make the assumption that any lock token URI is
a URN (i.e. globally and uniquely identifies that resource).

   >    When we talk about URIs that identify WebDAV resources, we can talk
   >    about URLs (because we know that they use the http: or https:
   >    scheme). Otherwise, we shouldn't make any assumptions.
   >
   > There are places where the URIs are not required to identify WebDAV
   > resources, but which are required to be usable to access the resource.
   > It believe it is appropriate and desireable to use "URL" for those
   > places (the DAV:source URIs are examples of such a place).

   I disagree that the source link MUST use "locatable" resources. We
   may want to clarify with Roy.

This is an excellent example of why we should carefully use URI,
URL, and URN in the spec.  The statement in section 13.10 of the spec
is potentially ambiguous.  In particular, it states: "there is
typically only one destination (dst) of the link, which is the URI
where the unprocessed source of the resource may be accessed."  I
interpreted the "typically" to just be a qualifier of "only one".  It
sounds like you interpreted "typically" to also be a qualifier of the
phrase following the comma (while I interpreted it as a definition of
"link").  I believe my interpretation is correct, because without
that, the source link is basically useless ... "here's is a URI of the
source, but you can't use it for anything of interest".

   Example: I might want to have a source link from
	   http://greenbytes.de/tech/webdav/rfc3253.html
   to
	   urn:ietf:rfc:3253

   Do you say this is wrong?

Yes, it is wrong if urn:ietf:rfc:3253 is not a URL (i.e. cannot be
used to apply methods such as GET and PUT to the resource that it
identifies).  But of course, urn:ietf:rfc:3253 easily could be a URL,
in which case it would be fine.

   >  "there is typically only one destination (dst) of the link,
   >   which is the URI where the unprocessed source of the resource
   >   may be accessed."

   This is just s statement about a typical use case. It doesn't say
anything
   about other use cases.

See above.  I consider "typically" to only qualify "one", not the
entire phrase following the comma as well.

   > If something can be "accessed" at that URI, that URI is a URL, as
   > URLs are defined in RFC-2396.  And therefore it would have been
   > preferable for section 13.10 to use the term "URL" here, and not "URI".

   So who defines whether a URI is a URN?

Whatever spec is defining the semantics of where that URI appears.  If
it is required to have the semantics of a URL, it is a URL.  If it is
required to have the semantics of a URN, it is a URN.  This gives the
client a vital piece of information about how it can use that URI.

   Is
	   urn:isbn:(some-isbn)
   a URN? Is it a URL?

Once again, although you can sometimes infer this from the scheme name
(i.e. a urn: scheme URI should always be a URN, and an http: scheme
URI should always be a URL), you can't infer the opposite, (i.e. a
particular urn: scheme URI might be a URL, and a particular http:
scheme URI might be a URN).

   If it *is* a URN (it clearly says so :-),

Note that it is RFC 2141 that says so, not the fact that "urn:"
happens to be the name of the URI scheme (although it would have
been a pretty poor choice for a URI scheme if the URIs in that
scheme weren't required to be URNs :-).

   does it become
   a URL as well if I install a custom URI resolver in my system which knows
   about the urn:isbn: scheme?

As Larry indicated, the concepts of a URL and URN have fuzzy
boundaries.  If something in theory would allow you to locate
something, but you don't have the locator software installed on your
machine, is it still a URL?  (I say, yes).  If something in theory
could be located, but no software has ever been written to locate it
using that URI scheme, is it a URL?  (I say, no).  But most cases are
clearer.  The point is, what can a client assume.  Can it assume that
it should be able to apply methods to the resource through that URI?
If so, it is a URL.  Can it assume that this will uniquely identify a
particular resource (for some sensible semantics definition of "a
resource")?  If so, it is a URN.

   I think the concept of *partitioning* URIs into
   "locatable" und "naming" is bogus -- both types can be used both ways.

Another strawman (actually, the same strawman being raised again :-).
Neither RFC-2396 nor I have ever maintained that the URI space is
*partitioned* into "locatable" (URL) and "naming" (URN).  In each of
my messages in this thread, I have emphasized the opposite, i.e. that
a URI can be a URL, a URN, both, or neither (and quoted from 2396
where it explicitly states that these concepts overlap).

   The joint IETF/W3C URI interest group is currently trying to
   clarify this, and we should not to rely on outdated definitions.

Are you referring to the "classical" and "contemporary" definitions
that appear in draft-mealling-uri-ig-02?  If so, I strongly recommend
that we stick to the definitions in 2396.  I found it a bit puzzling
that draft-mealling-uri-ig-02 chose to set up two (in my view,
bogus) strawman definitions, rather than just quoting from 2396,
which actually go it right (and had none of the confusion or problems
of the ones that they made up).

   > URL is used to access a resource, while a URN only guarantees
   > that it names that resource.  A particular URI can of course be

   URLs name resources as well. So are they a superset of URNs? Just
   wondering.

URLs do not "name" resources in the sense of the term as used
in URN.  For URNs "name" means "provides a globally unique name".
In general, something that "locates" a resource does not provide
a "globally unique name" for that resource, so, no, a URL does
not "name" a resource, and is not a superset of URNs.

   > both a URL and a URI, but for the purposes of the source property,
   > what matters is that it can be used to access the resource, and there
   > is no guarantee that it names the resource.

   URLs are URIs. URIs *identify* resources, so they provide
   *identifiers* for resources. Are you saying that an "identifier" is
   not a "name"?

I am using "locate" as it is defined for URL in 2396, and "name" as
it is defined for URN is 2396.  These are well defined terms in this
context (i.e. "locate" means "can be used to apply a method to the
resource identified by the URI", and "name" means "provides a globally
unique name for the resource identified by the URI").

   Resources never are *available* at a URI.

That will come as a surprise to the authors (and reviewers) of
RFC 2518, which in section 8.10.3 (:-).  In the section we have
been discussing, and which I have been quoting) states:
"A resource may be made available through more than one URI."

   *Representations* of resources may be available. I think you are
   saying that any URI that can be used to locate a representation of
   a resource is a URL.

I am just using terminology in the same way it is used in
2518.  One can certainly make the statement more detailed in
the way you have above, but I think the 2518 usage is fine.

   That may be technically correct (so *I* can basically turn every
   URN into a URL by implementing a resolver for the scheme).

And if such a resolver becomes so widespread that a client can
reasonably count on it being available, it is a URL.

   However, most people clearly have a different
   understanding of URLs. We should try to *avoid* additional
   confusion, instead of adding to it.

I think most people clearly have a different understanding of URLs
because most people don't bother to read the spec that defines the
term (:-).  If there is some problem with the current standardized
definition, then it can be fixed, but I believe the confusion
arises from all the various fuzzy non-standard definitions
(since the term is used by network TV anchormen, I am not surprised
that there is some confusion :-).

   > But we need to use the RFC-2396 definition of what is a URL,
   > and anything that can be used to access a resource is a URL.

   I don't think that RFC2396 says that.

That is how I interpret:

 "The term "Uniform Resource Locator" (URL) refers to the subset of
 URI that identify resources via a representation of their primary
 access mechanism (e.g., their network "location")"?

Note no reference to what their URI scheme may or may not be.

   For instance, "urn:isbn:"
   clearly is a URN scheme, but I still can use that URN to access a
   representation of the resource (the resource for instance being a
   book).

Because, as is explicitly stated by RFC 2396, a URI can be both a
URL and a URN.

   > In particular, whether or not something is a URL is not dependent
   > on its scheme, but is dependent on whether it provides the
   > ability to access the resource it identifies.

   How would a *URI* provide that ability? The URI itself is just a name.

Of course it is not the URI that provides that ability (a URI itself
has no characteristics other than its adherence to the URI syntax).
It is the semantics of the clients and servers that make a given URI
be either a URL or a URN.  This is clear from RFC 2396, which talks
about URLs in terms of "access mechanisms".

   For specific URI schemes, my computer may have software installed
   to resolve it, but that's it. RFC2396 characterizes a URL as a URI
   "that identify resources via a representation of their primary
   access mechanism (e.g., their network "location")", but that
   doesn't mean that a non-URL URI can't be used to locate the
   resource as well.

We can quibble over whether something is a URL if it provides a "secondary"
access mechanism instead of a "primary" access mechanism, but I see no
benefit in doing so.  The key aspect of a URL (that differentiates it
from a URN) is that it provides access to a resource (while a URN
provides a globally unique name for a resource).

   > We also should use URN in all cases where we *know* something is
   > a URN (such as a lock token).

   If you agree that a HTTP URL can be used to identify a lock token,
   thus it *is* both a URN and a URL, could you please give an example
   of a HTTP URL that doesn't share these properties (thus *isn't* a
   URN)?

Sure.  http://www.webdav.org/deltav/protocol/index.html.  I personally
have associated that URL with different resources over time, because
of various bad interactions with locking (at least, I think that was
the problem).  If that resource had a URN property, such as a
DAV:version-history, you would have been able to notice as much.
The same is true for most (probably all) resources on
http://www.webdav.org (although there are some links to some http
URNs in the http://www.ietf.org/ namespace.

Cheers,
Geoff
Received on Saturday, 13 July 2002 18:12:13 UTC