RE: Issue: URI_URL, proposed changes from Julian Reschke on 2002-07-13 (w3c-dist-auth@w3.org from July to September 2002)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sun, 14 Jul 2002 01:22:35 +0200
To: "Clemm, Geoff" <gclemm@rational.com>, "'Webdav WG \(E-mail\)'" <w3c-dist-auth@w3c.org>
Message-ID: <JIEGINCHMLABHJBIGKBCIEBHEPAA.julian.reschke@gmx.de>
> From: w3c-dist-auth-request@w3.org
> [mailto:w3c-dist-auth-request@w3.org]On Behalf Of Clemm, Geoff
> Sent: Sunday, July 14, 2002 12:12 AM
> To: 'Webdav WG (E-mail)'
> Subject: RE: Issue: URI_URL, proposed changes
>
>
>
>    From: Julian Reschke [mailto:julian.reschke@gmx.de]
>
>    > From:  Clemm, Geoff
>    >
>    > ... I (and RFC-2396) was making the opposite point, namely that
>    > two different resources cannot be identified by the same URN.  So
>    > if two resources have the same URN, you are guaranteed that they
>    > are the same resource.
>
>    I mis-read what you wrote, but I still disagree (now in a different
>    way). A URI (no matter whether URN or URL) identifies a
>    resource. So two different resources can't have the same URL
>    either. That's by definition.
>
> Sure they can.  Create a resource at /foo.html.  Create a resource at
> /bar.html.  These are separate resources.  MOVE /foo.html to
> /bar.html.  /bar.html now identifies a different resource.  That is
> the semantics of a URN ... if a URN identifies a given resource at
> time X, it also identifies that resource (or no resource at all) at
> time Y.

Ok. add "...at the same time".

>    >    > URL-references).  We also have some URIs which name a resource
>    >    > (in particular, lock tokens).  These should be called URNs.
>    >
>    >    I disagree. URLs are just a subset of URIs.
>    >
>    > We don't disagree there.
>
>    RFC2518 has defined that lock tokens are identified by URIs with no
>    restriction.
>
> That is incorrect.  The first line of section 6.4 (that defines a lock
> token URI) states: "The opaquelocktoken URI scheme is designed to be
> unique across all resources for all time."  That makes an
> opaquelocktoken URI a URN, and disallows using non-URN URIs.

Wrong. Lock tokens are defined in section 6.3. Section 6.4 defines a
particular URI scheme that *can* be used to identify locks. It doesn't say
that you need to use it to represent lock tokens :"This specification
provides a lock token URI scheme called opaquelocktoken that meets the
uniqueness requirements. However resources are free to return any URI scheme
so long as it meets the uniqueness requirements."

>    An attempt to restrict lock tokens to specific URI types (are
>    you trying to do that?)
>
> No, I have (several times :-) emphasized that whether or not a URI
> is a URL or URN does not require that the URI belong to a specific
> URI type.  A URI in any scheme can be a URL, if it can be used to
> apply a method to the resource identified by that URI.  A URI in any
> scheme can be a URN if it satisfies the semantics requirement of a URN.

I don't have any problem with that definition, however I'm sure that 99% of
the readers will have the same definitions of URL/URN in mind when they read
the spec, thus I fear unnecessary confusion.

>    >    When we talk about URIs that identify WebDAV resources,
> we can talk
>    >    about URLs (because we know that they use the http: or https:
>    >    scheme). Otherwise, we shouldn't make any assumptions.
>    >
>    > There are places where the URIs are not required to identify WebDAV
>    > resources, but which are required to be usable to access the
> resource.
>    > It believe it is appropriate and desireable to use "URL" for those
>    > places (the DAV:source URIs are examples of such a place).
>
>    I disagree that the source link MUST use "locatable" resources. We
>    may want to clarify with Roy.
>
> This is an excellent example of why we should carefully use URI,
> URL, and URN in the spec.  The statement in section 13.10 of the spec
> is potentially ambiguous.  In particular, it states: "there is
> typically only one destination (dst) of the link, which is the URI
> where the unprocessed source of the resource may be accessed."  I
> interpreted the "typically" to just be a qualifier of "only one".  It
> sounds like you interpreted "typically" to also be a qualifier of the
> phrase following the comma (while I interpreted it as a definition of
> "link").  I believe my interpretation is correct, because without
> that, the source link is basically useless ... "here's is a URI of the
> source, but you can't use it for anything of interest".

I have stopped thinking about the current wording for the source property a
long time ago. It doesn't make sense. It defines a resource property
(DAV:source), which contains the name of the resource itself as an element
(DAV:src). It just doesn't make sense (one of the reasons I raised the issue
in the issues document).

>    Example: I might want to have a source link from
> 	   http://greenbytes.de/tech/webdav/rfc3253.html
>    to
> 	   urn:ietf:rfc:3253
>
>    Do you say this is wrong?
>
> Yes, it is wrong if urn:ietf:rfc:3253 is not a URL (i.e. cannot be
> used to apply methods such as GET and PUT to the resource that it
> identifies).  But of course, urn:ietf:rfc:3253 easily could be a URL,
> in which case it would be fine.

I disagree. "urn:ietf:rfc:3253" is a URN for a resource. Actually, it is
*the* URN endorsed by the IETF to identify the versioning spec. If I want to
signal that a document depends on a source which is a RFC, it completely
makes sense to refer to it this way.

BTW: Can you define what "applicable to methods such as GET and PUT" means?
Is "mailto:" a URL scheme?

>    >  "there is typically only one destination (dst) of the link,
>    >   which is the URI where the unprocessed source of the resource
>    >   may be accessed."
>
>    This is just s statement about a typical use case. It doesn't say
> anything
>    about other use cases.
>
> See above.  I consider "typically" to only qualify "one", not the
> entire phrase following the comma as well.

I disagree, but it probably makes little sense to argue about this paragraph
unless we can find somebody who can explain what was intended. The
DAV:source property *as defined* doesn't make sense (see above).

>    > If something can be "accessed" at that URI, that URI is a URL, as
>    > URLs are defined in RFC-2396.  And therefore it would have been
>    > preferable for section 13.10 to use the term "URL" here, and
> not "URI".
>
>    So who defines whether a URI is a URN?
>
> Whatever spec is defining the semantics of where that URI appears.  If
> it is required to have the semantics of a URL, it is a URL.  If it is
> required to have the semantics of a URN, it is a URN.  This gives the
> client a vital piece of information about how it can use that URI.

I agree that it makes sense to give the client this piece of information. I
don't agree that just saying "URL" or "URN" will achieve this goal.

> As Larry indicated, the concepts of a URL and URN have fuzzy
> boundaries.  If something in theory would allow you to locate
> something, but you don't have the locator software installed on your
> machine, is it still a URL?  (I say, yes).  If something in theory

OK, can you give an example for a URN-typed URI which in *theory* can't be
located by some dictionary mechanism? It's not globally unique by magic --
it usually depends on naming authorities (DNS, phone numbers), dictionaries
(urn scheme, ISBN) and the like.

> could be located, but no software has ever been written to locate it
> using that URI scheme, is it a URL?  (I say, no).  But most cases are
> clearer.  The point is, what can a client assume.  Can it assume that
> it should be able to apply methods to the resource through that URI?

But what is this knowledge good for? Unless the client happens to *know* the
URI scheme, it won't be able to do anything useful with the URI (other than
use it as identifier), even if it knows it to be a "locator".

> If so, it is a URL.  Can it assume that this will uniquely identify a
> particular resource (for some sensible semantics definition of "a
> resource")?  If so, it is a URN.
>
>    I think the concept of *partitioning* URIs into
>    "locatable" und "naming" is bogus -- both types can be used both ways.
>
> Another strawman (actually, the same strawman being raised again :-).
> Neither RFC-2396 nor I have ever maintained that the URI space is
> *partitioned* into "locatable" (URL) and "naming" (URN).  In each of
> my messages in this thread, I have emphasized the opposite, i.e. that
> a URI can be a URL, a URN, both, or neither (and quoted from 2396
> where it explicitly states that these concepts overlap).

Yes. However, as the ID mentioned states, people *think* that there is a
partitioning. That's a problem that RFC2396bis should solve. We should stay
out of this business.

>    > URL is used to access a resource, while a URN only guarantees
>    > that it names that resource.  A particular URI can of course be
>
>    URLs name resources as well. So are they a superset of URNs? Just
>    wondering.
>
> URLs do not "name" resources in the sense of the term as used
> in URN.  For URNs "name" means "provides a globally unique name".
> In general, something that "locates" a resource does not provide
> a "globally unique name" for that resource, so, no, a URL does
> not "name" a resource, and is not a superset of URNs.
>
>    > both a URL and a URI, but for the purposes of the source property,
>    > what matters is that it can be used to access the resource, and there
>    > is no guarantee that it names the resource.
>
>    URLs are URIs. URIs *identify* resources, so they provide
>    *identifiers* for resources. Are you saying that an "identifier" is
>    not a "name"?
>
> I am using "locate" as it is defined for URL in 2396, and "name" as
> it is defined for URN is 2396.  These are well defined terms in this
> context (i.e. "locate" means "can be used to apply a method to the
> resource identified by the URI", and "name" means "provides a globally
> unique name for the resource identified by the URI").

Again: what's the difference between "naming" (URN) and "identifying" (URI).
Can you give an example of a URI that identifies, but not "names" a
resource?

>    Resources never are *available* at a URI.
>
> That will come as a surprise to the authors (and reviewers) of
> RFC 2518, which in section 8.10.3 (:-).  In the section we have
> been discussing, and which I have been quoting) states:
> "A resource may be made available through more than one URI."

Another thing we may want to fix.

>    *Representations* of resources may be available. I think you are
>    saying that any URI that can be used to locate a representation of
>    a resource is a URL.
>
> I am just using terminology in the same way it is used in
> 2518.  One can certainly make the statement more detailed in
> the way you have above, but I think the 2518 usage is fine.

Probably not. People may get the impression that WebDAV somehow magicallly
allows to do something HTTP can't (and doesn't attempt to). HTTP's GET
retrieves *representations* of resources. WebDAV doesn't change that. For
instance, GET/PROPFIND followed by PUT/PROPPATCH is not guaranteed to do the
same thing as COPY (even when not taking versioning/acl into account).

>    That may be technically correct (so *I* can basically turn every
>    URN into a URL by implementing a resolver for the scheme).
>
> And if such a resolver becomes so widespread that a client can
> reasonably count on it being available, it is a URL.

Who is the authority I can ask whether a particular URI scheme can "now" be
considered a URL? Actually, if Microsoft decides not to support "gopher:"
anymore -- does the gopher URI loose it's URL-property?

>    For specific URI schemes, my computer may have software installed
>    to resolve it, but that's it. RFC2396 characterizes a URL as a URI
>    "that identify resources via a representation of their primary
>    access mechanism (e.g., their network "location")", but that
>    doesn't mean that a non-URL URI can't be used to locate the
>    resource as well.
>
> We can quibble over whether something is a URL if it provides a
> "secondary"
> access mechanism instead of a "primary" access mechanism, but I see no
> benefit in doing so.  The key aspect of a URL (that differentiates it
> >from a URN) is that it provides access to a resource (while a URN
> provides a globally unique name for a resource).
>
>    > We also should use URN in all cases where we *know* something is
>    > a URN (such as a lock token).
>
>    If you agree that a HTTP URL can be used to identify a lock token,
>    thus it *is* both a URN and a URL, could you please give an example
>    of a HTTP URL that doesn't share these properties (thus *isn't* a
>    URN)?
>
> Sure.  http://www.webdav.org/deltav/protocol/index.html.  I personally
> have associated that URL with different resources over time, because
> of various bad interactions with locking (at least, I think that was
> the problem).  If that resource had a URN property, such as a
> DAV:version-history, you would have been able to notice as much.
> The same is true for most (probably all) resources on
> http://www.webdav.org (although there are some links to some http
> URNs in the http://www.ietf.org/ namespace.

Well, that's partly a question of how you define a resource. Note that even
though a HTTP resource is implemented inside a specific HTTP server, that
doesn't necessarily mean that it changes just because internally the
implementation of the resource changed. The author may have bound the name
to something completely different (making it a new resource internally), but
it might *still* represent the *same* resource as observable by clients. For
instance, just because somebody moves his stock ticker server from a ASP to
a PHP platform (new machines, new OS, new web server) doesn't mean that the
URL he assigned now denotes something else. It's just the implementation
that changed. (<http://www.w3.org/Provider/Style/URI.html>).

RFC2396: "The resource is the conceptual mapping to an entity or set of
entities, not necessarily the entity which corresponds to that mapping at
any particular instance in time. Thus, a resource can remain constant even
when its content---the entities to which it currently corresponds---changes
over time, provided that the conceptual mapping is not changed in the
process. "
Received on Saturday, 13 July 2002 19:23:14 UTC