RE: Using fragment identifiers with URNs from Stephen Cranefield on 2001-09-27 (uri@w3.org from September 2001)

From: Stephen Cranefield <SCranefield@infoscience.otago.ac.nz>
Date: Thu, 27 Sep 2001 16:50:23 +1200
To: "'uri@w3.org'" <uri@w3.org>
Message-ID: <B57613845A50D211864C0000F8FA5C280420761A@mars.otago.ac.nz>
Al Gilman wrote (in response to me):
> >  [I presume this last paragraph should really say "... when a URI
> >  is intended for retrieval ..." as a *URI reference* is 
> >  always intended for retrieval according to the definition in
> >  Section 4.1.]
>
> No, you are not following why it says that.  The #fragment 
> that appears in a URI-reference is not part of the URI.  So it
> can only say "A fragment identifier is only meaningful when a
> URI-reference..." because a fragment identifier is only _present_
> in the context of a URI-reference.  In the context of a URI, it
> simply isn't there.  You set aside the fragment identifier,
> and if you recover the resource associated with the URI that was 
> left on removal of #fragment, you can then, in the light of the
> type you ascertain from the recovered resource, use the #fragment
> in that context.

I understand that.  We are not in disagreement here.  To be even
more precise, I believe the sentence that I quoted from RFC2396
should say:

 A fragment identifier is only meaningful when the URI obtained by
 removing the '#' and fragment identifier is intended for retrieval
 and the result of that retrieval is a document for which the
 identified fragment is consistently defined.

This would make it clearer that the URI reference syntax is not
applicable to all URI schemes.
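
To illustrate the step Al describes (setting the fragment aside before
anything is retrieved), here is a minimal Python sketch. It assumes the
standard library's urllib.parse.urldefrag as the splitting mechanism;
the helper name split_reference and the example identifiers, including
the urn:example: name, are hypothetical and are not taken from RFC2396.

```python
from urllib.parse import urldefrag


def split_reference(uri_reference):
    """Split a URI reference into (uri, fragment).

    The fragment is returned without the '#' and is '' when absent.
    """
    result = urldefrag(uri_reference)
    return result.url, result.fragment


# The split is purely syntactic: it works the same way whether or not
# the scheme has any notion of retrieval.
print(split_reference("http://example.org/doc.html#section2"))
print(split_reference("urn:example:concept#part"))
```

Note that the split succeeds for the URN as well. Nothing in the syntax
says whether the fragment that was set aside can ever be given a
meaning; that depends on whether the remaining URI can be retrieved.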

> >These paragraphs [from RFC2396] clearly imply that not all URI
> >schemes have a notion of retrieval defined.
> >
> No, they do not imply anything of the sort.  They allow it.

But allowing it is the same as implying it!  RFC2396 *defines* URIs
(doesn't it?) and it does this not by giving a list of the properties
of individual schemes, but by constraining what is possible.  If it
allows for URN schemes that have no notion of retrieval then the
(Platonic, for now) existence of such schemes is implied by the
specification.

> Functionality such as relative URI references is not applicable to
> cid: and data: URLs.

The applicability and meaning of relative URI references are defined
individually for each scheme (although RFC2396 constrains the
type of things that this syntax can be used for).  Fragment IDs
seem to be treated differently.  In the definitions of URI schemes
that I have looked at (I admit I haven't done a definitive survey here)
the applicability or otherwise of fragment identifiers is not discussed.
The scheme specifications 'inherit' the meaning of fragment identifiers
defined in RFC2396, and the semantics of the MIME type of the retrieved
resource (which defines what the fragment actually identifies).  It is
this default 'inheritance' of the applicability of fragment identifiers
into all URI scheme specifications that seems at odds with the possibility
of having URNs that name abstract and non-retrievable concepts.

> >The retrieval mechanism for http URLs is defined by
> >the HTTP protocol, not any particular application.  
> 
> AG:: Negatory.  Applications routinely recover these objects 
> from a search list of mechanisms starting with local cache and
> extending out through HTTP as required.  And other applications
> don't use such a cache.  The application knows that it can try
> HTTP and will often succeed, but that is custom and not
> definition.

Surely Internet and Web standards need to define Web
architecture at a certain level of abstraction - and to consider
application-specific mechanisms to be outside their ambit.  I had
presumed that the use of a cache was not evident to a client, but
I admit I am not familiar with all possible HTTP status codes.

Anyway, with the obvious caveat that an application can always
choose to do something application-specific with a URI, I think my
point still applies: URL schemes define an associated retrieval
mechanism.  If they didn't, they wouldn't be much use.  However,
URN schemes need not, and in the absence of such a mechanism it
would be subverting the intent of that scheme for an application
to implement a custom retrieval mechanism for that scheme.
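
The distinction I am drawing could be sketched roughly as follows, in
Python. The table of schemes is a hypothetical assumption made for
illustration only (it is not a survey of real scheme specifications),
and the function name is my own. The sketch checks only the first
condition for a meaningful fragment, namely that the scheme defines
retrieval; the second condition, that the retrieved document's media
type defines the fragment, is deliberately left out.

```python
from urllib.parse import urlsplit

# Hypothetical registry: scheme name -> whether the scheme defines a
# retrieval mechanism. These entries are illustrative assumptions.
SCHEME_DEFINES_RETRIEVAL = {
    "http": True,
    "ftp": True,
    "urn": False,  # assumed here to name abstract, non-retrievable things
}


def fragment_may_be_meaningful(uri_reference):
    """Return True if a fragment is present and the scheme defines retrieval.

    Unknown schemes are treated as having no retrieval mechanism.
    """
    parts = urlsplit(uri_reference)
    if not parts.fragment:
        return False
    return SCHEME_DEFINES_RETRIEVAL.get(parts.scheme, False)


print(fragment_may_be_meaningful("http://example.org/doc.html#section2"))
print(fragment_may_be_meaningful("urn:example:concept#part"))
```

Under these assumptions the http reference passes the check and the
urn reference does not, even though both are syntactically valid URI
references.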

> The wording of the specification may fall short as far as making the
> intent patently clear.
> 
> On the other hand it also falls far short of disproving what Roy
> said, which is IIRC what the intention of the group was as he was
> editing it on their behalf.

("IIRC" decoded thanks to
http://www.tuxedo.org/~esr/jargon/html/entry/IIRC.html)

I'm certainly not trying to disprove that Roy meant what he says he
meant when he edited the document!  In fact, being a newcomer to this
list, I didn't notice that Roy was an author until after I posted my
last message.  My aims in this discussion are to learn about some of
the assumptions underlying RFC2396 that aren't obvious to newcomers
like myself, and also to advocate that there are valid uses for URNs
that do not have any semantics related to retrieval.  Unfortunately
there seem to be a lot of implicit assumptions made in discussions about
URIs that these are always associated with retrieval, but I haven't
yet found a definitive statement that this is intrinsic to the notion
of a URI.

- Stephen
Received on Thursday, 27 September 2001 00:47:58 UTC