Re: Are URI-References bound to resources? from Dan Connolly on 2001-05-14 (uri@w3.org from May 2001)

From: Dan Connolly <connolly@w3.org>
Date: Mon, 14 May 2001 01:31:18 -0500
To: "Roy T. Fielding" <fielding@ebuilt.com>
CC: Aaron Swartz <aswartz@upclink.com>, uri@w3.org
Message-ID: <3AFF7BB6.7C28B36B@w3.org>
"Roy T. Fielding" wrote:
[...]
> > > > > This seems to imply that URI references (that is, URIs with fragment
> > > > > identifiers) are not bound to a resource themselves.
> > > >
> > > > Careful... it does not imply that URI references are
> > > > bound to resources; but nor does it imply that they are *not*
> > > > bound to resources. RFC2396 is silent on what
> > > > a URI reference is bound to.
> > >
> > > Eh?  It is a resource identifier. It identifies resources.
> >
> > The question is about the term "URI reference". You're saying
> > that a URI reference identifies a resource? That's perhaps
> > consistent with the RFC, but I don't see it stated in
> > the RFC.
> 
> Ooops, missed that.  RFC 2396 treats those as two separate identifiers
> within the same syntactical protocol element.  I could have sworn that was
> in the spec.

Sure; but now let's take the two separate identifiers and look
at them as one. For example, In HTML 2.0, it was called an
"anchor address" and it was said to refer to an "anchor":

[[[
A hyperlink is a relationship between two anchors, called the head and
the
tail of the hyperlink[DEXTER]. Anchors are identified by an anchor
address:
an absolute Uniform Resource Identifier (URI), optionally followed by a
'#' and
a sequence of characters called a fragment identifier. For example: 

]]]

--        Hypertext Markup Language - 2.0 - Hyperlinks
http://www.w3.org/MarkUp/html-spec/html-spec_7.html#SEC7
Wed, 18 Oct 1995 06:29:36 GMT

The editors of the RDF spec chose to use the term 'resource'
for not only those things that RFC2395 calls 'resource', but
for the whole set of things that the HTML 2 spec
calls 'anchor'. That's not wrong; it's just different.


> > Leaving aside the fact that a relative URI reference needs
> > a bit of context (a base URI) to identify a resource,
> > I'm fine with this view of things; in fact, I think it's
> > the view used in the RDF spec; but you seem to think
> > otherwise; I'm confused...
> 
> The RDF spec refers to the representation as the resource -- in other words,
> that the fragment itself is a resource.

No, when the RDF spec says that
http://www.w3.org/2000/01/rdf-schema#label 
identifies a resource, it's not talking about some particular fragment
of a particular representation of the (state of the)
resource identified by http://www.w3.org/2000/01/rdf-schema ,
any more than http://www.w3.org/2000/01/rdf-schema identifies
just one piece of content. http://www.w3.org/2000/01/rdf-schema#label
identifies something that you can learn about using HTTP;
you can learn that it's a relationship between something
and a (string) label for that something.

> > > > > Instead, the only
> > > > > resource involved is that of the absolute URI itself.
> > > > >
> > > > > Is this interpretation correct?
> > > >
> > > > I don't think so; I think you're reading more into RFC2396
> > > > than is there. (you're certainly not the first, and
> > > > I don't expect you'll be the last.)
> > >
> > > No, that is correct.
> >
> > The interpretation he's asking about says that
> >   http://example.org/q#foo
> >     and
> >   http://example.org/q#bar
> > identify the same resource. Are you sure that's correct?
> > i.e. can you justify that from the text of RFC2396?
> > As far as I can tell, RFC2396 doesn't say what resource
> > either of them identifies; it only tells you that
> > yes, they identify the same resource after you remove
> > the #fragid. Duh.
> 
> The resource identifier is before the fragment.

No, *a* resource identifier is before the fragment.

If you extend the definition of the term resource (in a consistent way),
another resource identifier is the whole thing: http://example.org/q#foo
.

>  The fragment identifies
> a subset of the representation retrieved after a GET of the resource.

That's one way to learn something about that identifier; there
are other ways, too; you can get statements about it from
other documents, etc.

> It isn't possible to "access" the fragment on its own, and hence it is
> not itself a resource

Well, that's a limited view of the term resource. But it can
be extended (again, in a consistent fashion) to include more
things, like properties, classes, numbers, people, noses,
pink elephants, and so on.

> until someone provides another URI that that
> has the semantics of mapping to that representation as a resource
> (i.e., an indirect reference).

Er... that's like saying "42 isn't a number until somebody
writes down 6 and 7 and multiplies them." 42 always was
a number. http://www.w3.org/2000/01/rdf-schema#label always
was a resource (in the RDF sense).

[there are some limits to what is a resource... e.g.
things like the set of all sets that aren't members of themselves;
The RDF/RDFS specs do lack answers to some questions in
that neightborhood, but I think folks will agree that's
a separate issue.]


> > >  At no time whatsoever is the resource transferred
> > > across the network when doing a GET.
> >
> > Quite; who/what suggested it was?
> 
> That's what the definition you quoted said when it referred to the fragment
> as a resource.

Read it again; it doesn't say that. The XLink folks didn't go there.
http://www.w3.org/TR/2000/PR-xlink-20001220/#N789


> > >  Only a REPRESENTATION of that resource
> > > is transferred, and the fragment refers to a target within the representation
> > > and not within the resource.  That is why fragments are media-type specific.
> >
> > Yes, but again, I don't see how that's directly relevant;
> > the question is what
> >   http://example.org/q#foo
> > is bound to; and in particular, whether it's reasonable
> > to say that it's bound to a resource.
> 
> Now you lost me.  What does "bound" mean?

For "X is bound to Y" read: "X identifies Y".
(in other words: "X denotes Y")

>  I would say that the namespace
> is bound according to the authority example.org.  Since example.org is never
> given the opportunity to interpret #foo,

Interpret? huh? Surely they're given the opportunity to say
what http://example.org/q#foo means by putting a document
at http://example.org/q , no?

> that part of the URI reference
> isn't within example.org's authority, and hence it is not a resource.

Er...  the question that started this thread
is asked in the context of not only
RFC2396, where 'resource' is only used to mean things
identified by absolute URIs, but *also* in the context of
the RDF 1.0 spec, where 'resource' is also used to mean things
identified by absolute-URIs-with-fragment-identifiers.

So it's not very interesting to answer "... hence it is
not a resource in the RFC2396 sense of the word."


> > > > > If so, it would have serious consequences
> > > > > for many RDF specifications.
> > > >
> > > > RDF isn't the only spec that extends the domain
> > > > of the URI->resource mapping; XLink does too--
> > > > er... it did in earlier drafts... I pointed
> > > > that out in a review comment... during last call,
> > > > I think; they seem to have changed their mind since then...
> > > >
> > > > [[[
> > > > The notion of resources is universal to the World Wide Web. [Definition:
> > > > As
> > > > discussed in [IETF RFC 2396], a resource is any addressable unit of
> > > > information or service.] Examples include files, images, documents,
> > > > programs, and query results. The means used for addressing a resource is
> > > > a URI (Uniform Resource Identifier) reference (described more in 5.4
> > > > Locator Attribute (href)). It is possible to address a portion of a
> > > > resource.
> > > > ]]]
> > >
> > > That definition is wrong.
> >
> > Really? how so? It just says "the term resource in this
> > spec is imported from RFC2396". How could it be wrong?
> 
> "It is possible to address a portion of a resource."  That is wrong.

No, it's not wrong; it's just different.

> It is possible to reference a portion of a representation.

Just as it makes
sense to say (as RFC2396 does) that an absolute
URI identifies a resource that you don't directly observe... you
only indirectly observe its state by bouncing messages off it;
in the same way, it makes sense to say (RDF 1.0 does) that an
absolute-URI-with-fragment-identifier identifies
a resource that you don't directly observe... you only indirectly
observe it by looking at statements that mention it.

Note that HTML 2.0 also talked about addressing portions
of resources.

>  Besides that,
> why does it use the term "addressable" in the first place?  Addressable
> implies locator which defines URLs, not URI.

I chalk that up to editorial discretion; I don't infer any
distinction between "addressing a resource" and "identifying
a resource". It seems to be a pay-me-now (convince
the XLink folks that they should say "identify") or
pay-me-later (convince other folks that "address" means
the same thing, when talking about web resources, as "identify)
sort of thing.

>  If it is going to refer to
> the RFC 2396 definition, then it should use the RFC 2396 definition.

I can't argue with that.

> > >  So are the definitions used by RDF.
> >
> > Really? How so? which text in the RDF spec conflicts
> > with which text in RFC2396?
> 
> This text:
> 
>  Resources
>     All things being described by RDF expressions are called resources.
>     A resource may be an entire Web page; such as the HTML document
>     "http://www.w3.org/Overview.html" for example. A resource may be a part
>     of a Web page; e.g. a specific HTML or XML element within the document
>     source. A resource may also be a whole collection of pages; e.g. an
>     entire Web site. A resource may also be an object that is not directly
>     accessible via the Web; e.g. a printed book. Resources are always named
>     by URIs plus optional anchor ids (see [URI]). Anything can have a URI;
>     the extensibility of URIs allows the introduction of identifiers for
>     any entity imaginable.
>
> and it later proceeds to be clueless about the semantic differences between
> resources (mappings/bindings/names) and representations (the stuff you
> might get as a response when accessing the resource).

I don't think so; it's conventional, though somewhat anachronistic,
to speak of "the HTML document http://www.w3.org/Overview.html"
even though that URI actually identifies a resource that has different
representations over time, and even though it's not the
name (URI) that tells you it's HTML, but all the responses
you get back that say Content-Type: text/html.

Perhaps it's time to do away with such anachronisms. But using
old-speak is not necessarily symptomatic of cluelessness.

[by the way... what happened to 'entity' and 'entity body'?
Representation is pretty much synonymous with those, no?]

> > >  That's why they keep having semantic problems with the Web.
> >
> > Who is "they"? What problems?
> 
> The folks on this list who have several times requested that "resource"
> be redefined to be some subset of the Web's resources in order to solve
> some logic problem in RDF.  You'll have to search the archives for the
> specifics.

Well, I prefer to just trust you that some folks have made
such requests, and join you in dismissing them out of hand.

The RDF spec extends the definition of 'resource' from RFC2396,
but it does so in a compatible way. There's no reason
to change RFC2396. (There may be a reason to change the RDF
spec, but it's an editorial matter; technically, the definitions
work just fine.)

> > > > Keep in mind that RDF 1.0 was finished Feb 1999,
> > > > just a few months after RFC2396 in Aug 1998. It would
> > > > seem perfectly reasonable to do an editorial revision
> > > > of RDF to make a new term for what RDF 1.0 calls 'resource',
> > > > and use 'resource' to mean just what RFC2396 defines
> > > > it to mean.
> > >
> > > Dude, I finished the definition of resource a long time before RFC 2396
> > > was published -- it was in the November 1997 draft.
> >
> > You finished *your* definition of 'resource' a long time ago.
> > Getting that to be *the* definition took quite a while longer...
> > i.e. getting folks to read it, understand it, agree to it,
> > etc.
> 
> Not my problem -- it was on a public forum, representing the future of the
> most important aspect of the Web architecture, and the fact that some folks
> at the W3C didn't bother to keep apprised of the definition while they
> were supposedly defining the uber-meta-model for describing resources
> isn't something you should be defending.

Well, you can stick your head in the sand about the difference
between words on paper and understanding in the community,
but I'd rather you didn't.
I didn't think I was defending anything. I was just observing
that dissemination of technical information in the community
takes time. Some folks at the W3C *did* keep apprised. But
to expect everybody in every W3C working group to read
every draft is either arrogant or silly. Waiting until
something is an RFC is an entirely reasonable position
for most of the community to take.

> > >  And even that was
> > > just a rewording of the "HTTP object model" (now called the REST architectural
> > > style) that I was working on back in 1995 during our whiteboard discussions
> > > at the W3C.
> >
> > I agree; I don't think anything has changed in substance for
> > about 10 years.
> 
> Not in URI land, hopefully.  It doesn't mean that the old specifications
> were right.  Tim's design documents still refer to resources as documents.
> That was true prior to 1993, but not after 1993.

It didn't become false to say that http://www.w3.org/
is a document, if you take 'document' in the smalltalk/objective-C
sense, where it means something that has
mutable state and knows how to 'paint' itself in multiple formats;
that usage still makes sense; it's just not endorsed as an IETF
draft standard.

>  My goal was to update
> the semantics so that they reflected what was actually working on the Web,
> without losing the fundamental principles that Tim described to me at W3C.

I don't think the semantics changed along the way; we just
changed "Universal Document Identifier" to "Uniform Resource Identifer"
without making any technical changes. The libwww code from 1990
or so still runs just fine.


> > > > TimBL went that way in some code he wrote recently;
> > > > he uses 'Thing' for the class of things that includes
> > > > resources *and* things denoted by absolute-uris-with-fragments:
> > >
> > > Just shoot me.  Its my fault for not sending you guys a copy of my
> > > dissertation when I was writing it.  I didn't realize that we had diverged
> > > so far from a common model.
> >
> > Chill, Roy. Show me the text on which you're basing this
> > conclusion. I think you're being hasty. I don't see
> > divergence; I see two views:
> 
> No worries, I wasn't upset with you -- just me.
> 
> > I. The conservative view:
> >
> > There's a mapping
> >       Absolute URI -> Resource
> > Since http://example.org/q#foo isn't an absolute
> > URI, RFC2396 doesn't say what it identifies; folks
> > should use some other term for what it identifies.
> > (This is the view used in TimBL's recent hack
> > with the Thing class, probably because folks were
> > uncomfortable with ...)
> 
> Yeah, but I worked out a definition of that "thing" during the HTTP/1.1
> stuff.  I call it a representation.

Nope. See above starting with "but now let's take the
two separate identifiers and look at them as one."

> > II. The original/liberal view
> >
> > There's a mapping
> >       Absolute URI, with optional fragment identifier -> Resource
> > RFC2396 only explicitly talks about the part of that
> > mapping where the domain is an Absolute URI with no fragment
> > identifier, but RFC1630, among other specs, talks
> > about the whole mapping.
> 
> Right, and we abandoned it in 1995 because that definition caused
> all sorts of ambiguities in the specs.

No, not ambiguities... folks were uncomfortable with
the level of extensibility, so they took the general
model and introduced arbitrary limitations based
on which parts had been implemented. Somewhat
like denying that there are arbitrarily large
integers because we'd only seen 32bit registers
to date. Sigh.

>  For example, how is access
> control assigned to "things" on the Web.  By the resource.

Well, one could also say it's by the message; after all,
a GET access policy is likely to be very different from a PUT access
policy in most cases; and it's not uncommon to change
the access policy of a resource over time.

Neither view is in any ratified spec, as far as I know;
i.e. the ratified definition of 'resource' doesn't include
any notion of access control (did I miss something?)

>  Is it possible
> to define separate access control to different fragments of the same
> resource?  No.

[not to disagree, but
I presume you mean "is it practical to implement/deploy...?"
I can trivially define it: all accesses are granted.]

>  Therefore, a fragment is not a resource until it is
> bound as some other URI by a naming authority that can control access
> to the fragments as separate, identifiable resources.

That's one view. The RDF spec takes another view. Note that
the view taken by TimBL's cwm.py code is this "a fragment
is not a resource" view; he's taken the view that
it's too confusing to have two different definitions
of 'resource' around, and that the RDF specs (or at least: his
RDF code) should use 'Thing' for the general case
of something that's either an RFC2396-resource or
something-identified-by-an-absolute-URI-with-fragment-identifier.

>  Because if we
> decided the other way -- that a fragment was a resource too -- then we'd
> have to define a new term for that subset of "old-style resources" that
> were actually subject to the Web behavioral model.
>
> The same logic applies to many other aspects of the Web design beyond
> access control.  That's why I separated the definition of resource
> and representation, and hence why REST is an acronym for representational
> state transfer.  I needed to do that for HTTP/1.1 because the old model
> just didn't fit things like CGI, Apache modules, and URN indirection.

I agree that there's a distinction between RFC2396-resource
and MIME-entity-body. But there's also another distinction:
between Things and Resources.


-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Monday, 14 May 2001 02:31:33 UTC