RE: Issue: URI_URL, proposed changes from Julian Reschke on 2002-07-14 (w3c-dist-auth@w3.org from July to September 2002)

From: Julian Reschke <julian.reschke@gmx.de>
Date: Sun, 14 Jul 2002 19:50:40 +0200
To: "Clemm, Geoff" <gclemm@rational.com>, "'Webdav WG \(E-mail\)'" <w3c-dist-auth@w3c.org>
Message-ID: <JIEGINCHMLABHJBIGKBCAEBNEPAA.julian.reschke@gmx.de>
> From: w3c-dist-auth-request@w3.org
> [mailto:w3c-dist-auth-request@w3.org]On Behalf Of Clemm, Geoff
> Sent: Sunday, July 14, 2002 5:45 PM
> To: 'Webdav WG (E-mail)'
> Subject: RE: Issue: URI_URL, proposed changes
>
>
>
> On the off chance that the details of this particular thread
> are probably becoming less interesting to anyone other than
> Julian and myself (:-), I'll just summarize my position:
>
> - We should adhere to the RFC-2396 definitions of URI, URL, and URN,
> since no problems with those definitions have been identified.  I do
> not consider the fact that many people don't know/use those
> definitions to be a problem, since the "commonly held" definitions
> incorrectly consider URL and URN as partitioning the URI space
> (i.e. problems have been identified with those "commonly held"
> definitions).  We could try to make up yet another definition (that
> is different from both RFC-2396 and the "commonly held" definitions)
> but I see no point in doing so.
>
> - We should use "URL" or "URN" in 2518bis whenever we are defining a URI
> that has the semantic properties of a URL or a URN (as defined by 2396).

I agree, as long as the wording makes clear what is meant.

> Specific responses to Julian's most recent comments embedded below.
>
> Cheers,
> Geoff
>
>
>    From: Julian Reschke [mailto:julian.reschke@gmx.de]
>
>    >    >    > URL-references).  We also have some URIs which name a
> resource
>    >    >    > (in particular, lock tokens).  These should be called URNs.
>    >    >
>    >    >    I disagree. URLs are just a subset of URIs.
>    >    >
>    >    > We don't disagree there.
>    >
>    >    RFC2518 has defined that lock tokens are identified by
> URIs with no
>    >    restriction.
>    >
>    > That is incorrect.  The first line of section 6.4 (that
> defines a lock
>    > token URI) states: "The opaquelocktoken URI scheme is designed to be
>    > unique across all resources for all time."  That makes an
>    > opaquelocktoken URI a URN, and disallows using non-URN URIs.
>
>    Wrong. Lock tokens are defined in section 6.3.
>
> It is true that I quoted the wrong section.  But if you read section
> 6.3, it says the same thing, i.e.: "Lock token URIs MUST be unique
> across all resources for all time."  That requires all lock token
> URIs to satisfy the definition of a URN, which defines each lock
> token URI to be a URN.

Yes.

>    Section 6.4 defines a particular URI scheme that *can* be used to
>    identify locks. It doesn't say that you need to use it to represent
>    lock tokens :"This specification provides a lock token URI scheme
>    called opaquelocktoken that meets the uniqueness
>    requirements. However resources are free to return any URI scheme
>    so long as it meets the uniqueness requirements."
>
> And "meeting the uniqueness requirements" makes it a URN.

Yes.

>    >    An attempt to restrict lock tokens to specific URI types (are
>    >    you trying to do that?)
>    >
>    > No, I have (several times :-) emphasized that whether or not a URI
>    > is a URL or URN does not require that the URI belong to a specific
>    > URI type.  A URI in any scheme can be a URL, if it can be used to
>    > apply a method to the resource identified by that URI.  A URI in any
>    > scheme can be a URN if it satisfies the semantics
> requirement of a URN.
>
>    I don't have any problem with that definition, however I'm sure
>    that 99% of the readers will have the same definitions of URL/URN
>    in mind when they read the spec, thus I fear unnecessary confusion.
>
> Then all we have to do is explicitly repeat this definition in the
> terminology section of the revised version of 2518, supplemented
> with a reference to 2396.

Fine with me.

>    > This is an excellent example of why we should carefully use URI,
>    > URL, and URN in the spec.  The statement in section 13.10 of the spec
>    > is potentially ambiguous.  In particular, it states: "there is
>    > typically only one destination (dst) of the link, which is the URI
>    > where the unprocessed source of the resource may be accessed."  I
>    > interpreted the "typically" to just be a qualifier of "only one".  It
>    > sounds like you interpreted "typically" to also be a qualifier of the
>    > phrase following the comma (while I interpreted it as a definition of
>    > "link").  I believe my interpretation is correct, because without
>    > that, the source link is basically useless ... "here's is a
> URI of the
>    > source, but you can't use it for anything of interest".
>
>    I have stopped thinking about the current wording for the
> source property
> a
>    long time ago. It doesn't make sense. It defines a resource property
>    (DAV:source), which contains the name of the resource itself as an
> element
>    (DAV:src). It just doesn't make sense (one of the reasons I raised the
> issue
>    in the issues document).
>
> I don't see that this needs to be a complicated issue.  The DAV:src
> field allows you to indicate that multiple resources depend
> on this link, to give the client an indication of all the other
> resources that would be affected by authoring the DAV:dst of that
> link.  For Web content management, an example of this would be a
> "template" resource from which a large number of other resources are
> derived.  In this particular example, the number of other resources is
> so large that I think it unlikely that a server would ever enumerate
> them in the DAV:src field.  Because I think this will commonly be the
> case, I'd be happy to deprecate the DAV:src field in the revision of
> 2518.  I personally would also like to rename the DAV:dst element to
> just be a DAV:href element (for interoperation with the
> DAV:expand-property REPORT), but that depends on whether any clients
> are really using DAV:link and therefore expects DAV:dst (current
> indications are that there are no such clients, so it would be
> safe to make this change).

Agreed.

>    >    Example: I might want to have a source link from
>    > 	   http://greenbytes.de/tech/webdav/rfc3253.html
>    >    to
>    > 	   urn:ietf:rfc:3253
>    >    Do you say this is wrong?
>    >
>    > Yes, it is wrong if urn:ietf:rfc:3253 is not a URL (i.e. cannot be
>    > used to apply methods such as GET and PUT to the resource that it
>    > identifies).  But of course, urn:ietf:rfc:3253 easily could be a URL,
>    > in which case it would be fine.
>
>    I disagree. "urn:ietf:rfc:3253" is a URN for a resource. Actually,
>    it is *the* URN endorsed by the IETF to identify the versioning
>    spec.  If I want to signal that a document depends on a source
>    which is a RFC, it completely makes sense to refer to it this way.
>
> I believe the central value of the DAV:source field is that it allows
> the client to perform methods on the resource identified by the
> DAV:dst field to cause changes to the content of the resource.  This
> value is not provided if the URI in the DAV:dst field is not a URL.
> Having a URI that is not a URL would not provide a mechanism for
> the client to effectively author the resource, which leads me to
> conclude that the spec should require the DAV:dst field to be a URL.

I don't buy that. You seem to argue that if a source for a document is not
authorable (which actually is the case in my example), then it's not worth
mentioning it as source. I think this is wrong.

>    BTW: Can you define what "applicable to methods such as GET and
>    PUT" means?  Is "mailto:" a URL scheme?
>
> It probably would be valuable to identify a minimal set of methods
> that SHOULD be applicable to the DAV:dst URL, in order to provide for
> interoperation.  Perhaps: "Either GET/PUT or PROPFIND/PROPPATCH SHOULD
> be supported on the DAV:dst URL".

It seems that you now require the URL to be either a http: or https: based
one. What about "mid:" or "ftp:"? These schemes define locations, but do not
have a concept of "methods".

See Roy's comment in
<http://lists.w3.org/Archives/Public/w3c-dist-auth/2002AprJun/0183.html>:

"That is not relevant.  The resources might just as easily be ftp or file
URLs, or might only be authorable by someone with authorization or coming
from a particular IP address.  Identifying the source does not imply
authorability on that resource, nor does it need to in order to be useful to
the user."

>    >    >  "there is typically only one destination (dst) of the link,
>    >    >   which is the URI where the unprocessed source of the resource
>    >    >   may be accessed."
>
>    ... it probably makes little sense to argue about this paragraph
>    unless we can find somebody who can explain what was intended. The
>    DAV:source property *as defined* doesn't make sense (see above).
>
> Do you just mean the lack of utility of the DAV:src element?  That's
> easily resolved by deprecating the DAV:src element.  Then we just
> rewrite that section as:
> "The destination (dst) of the link is the URL where an unprocessed
> source of the resource may be accessed.  Typically there is only one
> DAV:dst element, but there could be multiple if the resource content
> is derived from multiple inputs."

I disagree with the attempt to (now) require the source to be identified by
a URL. It doesn't help the client at all. Even if the server firmly believes
that a URI has "locator" quality, it can never be sure that the client
agrees (== has the same resolver software installed). Except of course it
happens to be an http: URL.

>    > could be located, but no software has ever been written to locate it
>    > using that URI scheme, is it a URL?  (I say, no).  But most cases are
>    > clearer.  The point is, what can a client assume.  Can it assume that
>    > it should be able to apply methods to the resource through that URI?
>
>    But what is this knowledge good for? Unless the client happens to
>    *know* the URI scheme, it won't be able to do anything useful with
>    the URI (other than use it as identifier), even if it knows it to
>    be a "locator".
>
> Of course a particular client may not have the software to access that
> URL, or the server that should be responsible for applying methods to
> that resource may not actually do so.  What we are trying to avoid is
> having implementors misunderstand the semantics of a particular
> field.  This
> gives the client and server writers essential guidance wrt what kind
> of URI they should allow in a particular field.  With the example above,

Given an arbitrary URI, what is the mechanism I can apply to it to find out
whether it has the URL-property?

> if we decide that the DAV:dst field should contain a URL, then a server
> should not store a urn: URI that they know is not commonly supported
> by clients and servers to apply methods to that resource.

I completely disagree, unless you can describe a sane algorithm that makes
this decision.

>    >    Resources never are *available* at a URI.
>    >
>    > That will come as a surprise to the authors (and reviewers) of
>    > RFC 2518 (:-).  In the section we have been discussing, and which
>    > I have been quoting) states: "A resource may be made available
>    > through more than one URI."
>
>    Another thing we may want to fix.
>
> If it ain't broke, don't fix it (and in this case, if it ain't broke,
> don't break it :-).
>
>    >    *Representations* of resources may be available. I think you are
>    >    saying that any URI that can be used to locate a representation of
>    >    a resource is a URL.
>    >
>    > I am just using terminology in the same way it is used in
>    > 2518.  One can certainly make the statement more detailed in
>    > the way you have above, but I think the 2518 usage is fine.
>
>    Probably not. People may get the impression that WebDAV somehow
> magicallly
>    allows to do something HTTP can't (and doesn't attempt to). HTTP's GET
>    retrieves *representations* of resources. WebDAV doesn't change that.
>
> But WebDAV isn't just talking about GET ... it also talking about
> PROPFIND and others.  Saying that a resource is available says that
> there are methods that can be applied to that URI that will not fail.

Maybe this definition should be added somewhere...

> In particular, some resources that are "available" do not support
> GET, so we are not talking about the "availability of the representation".

Such as? Just because the spec doesn't define the *specific* behaviour for
GET, this doesn't mean that GET should fail. Do you propose to fail an
attempt to GET a representation of a VHR? If so, what HTTP response code
would you propose? 204 (no content) seems to make sense, but it wouldn't be
a failure code, thus the VHR *would* support GET.

>    For instance, GET/PROPFIND followed by PUT/PROPPATCH is not
>    guaranteed to do the same thing as COPY (even when not taking
>    versioning/acl into account).
>
> "Availability" says nothing about the semantics of successive operations.
> It just says that a method can be applied to that URI and succeed.

So it can be any method? For instance, if GET on a URI returns 404, can I
say that the resource identified by "a" is "available", just because a PUT
on it would succeed?

>    >    That may be technically correct (so *I* can basically turn every
>    >    URN into a URL by implementing a resolver for the scheme).
>    >
>    > And if such a resolver becomes so widespread that a client can
>    > reasonably count on it being available, it is a URL.
>
>    Who is the authority I can ask whether a particular URI scheme can
>    "now" be considered a URL? Actually, if Microsoft decides not to
>    support "gopher:" anymore -- does the gopher URI loose it's
>    URL-property?
>
> Is indicated earlier, this is not a formally demonstrable property
> of a URI (such as its syntax).  It is a qualitative property, and there
> will be grey areas.  And yes, if a URI scheme is no longer widely
> implemented, the URIs in that scheme are no longer URLs.  And similarly,

So what happens to URLs using that scheme that happen to be already stored
in DAV:source properties? Would it be illegal for a server to keep them? How
would the server know that something it accepted in the past as valid URL
has lost it's URL property?

> if a urn: scheme is widely implemented to not provide globally unique
> naming, then a URI in that urn: scheme would not be a URN.  The concepts
> of URN and URI are worthless if they just refer to some syntactic
> notion (e.g. "has a scheme name that begins with the characters 'u',
> 'r', and 'n').

Yes. I didn't propose that interpretation.

>    >    If you agree that a HTTP URL can be used to identify a lock token,
>    >    thus it *is* both a URN and a URL, could you please give
> an example
>    >    of a HTTP URL that doesn't share these properties (thus *isn't* a
>    >    URN)?
>    >
>    > Sure.  http://www.webdav.org/deltav/protocol/index.html.  I
> personally
>    > have associated that URL with different resources over time, because
>    > of various bad interactions with locking (at least, I think that was
>    > the problem).  If that resource had a URN property, such as a
>    > DAV:version-history, you would have been able to notice as much.
>    > The same is true for most (probably all) resources on
>    > http://www.webdav.org (although there are some links to some http
>    > URNs in the http://www.ietf.org/ namespace.
>
>    Well, that's partly a question of how you define a resource. Note
>    that even though a HTTP resource is implemented inside a specific
>    HTTP server, that doesn't necessarily mean that it changes just
>    because internally the implementation of the resource changed.
>
> That is of course true.  Just like the concepts of a URN and URL, the
> concept of "object identity" has fuzzy boundaries (in all domains,
> not just IETF protocols).  As we introduce more advanced protocol
> funtionality (versioning, binding), these boundaries start to get
> firmed up though.  In particular, those protocol extensions make it
> clear that MOVE associates an existing resource (with a given identity)
> with a different URI, while COPY creates a new resource (whose identity
> is different from the resource that was the source of the COPY).

It will still depend on the point of view. From the client's perspective,
it's really irrelevant whether the internal implementation has changed, as
long as the abstract resource identified by the URI is the same.
Received on Sunday, 14 July 2002 13:50:50 UTC