Re: Summary: Section 2: What does a URI identify? from Brian McBride on 2002-03-16 (www-tag@w3.org from March 2002)

From: Brian McBride <bwm@hplb.hpl.hp.com>
Date: Sat, 16 Mar 2002 13:23:38 +0000
To: Norman Walsh <Norman.Walsh@sun.com>, www-tag@w3.org, skw@hplb.hpl.hp.com
Message-Id: <5.1.0.14.0.20020316105236.00aa1b40@0-mail-1.hpl.hp.com>
At 12:57 15/03/2002 -0500, Norman Walsh wrote:
>At a recent telcon, Stuart Williams and I agreed to publish our one
>page summary of section 2 of the architecture document this week. We
>are aware of a few comments that have not been addressed yet, and I
>expect this publication will generate a whole lot more, so please
>remember that this is a work in progress. (In fact, discussion of this
>document is on the agenda for the *next* TAG meeting, so this document
>cannot even purport to represent the consensus of the TAG :-).
>
>   http://www.w3.org/2001/tag/doc/identify.html
>
>                                         Be seeing you,
>                                           norm

I'm really glad to see the TAG taking a look at this issue.  RDFCore has
some dependencies on the outcome, so I hope to follow this discussion with 
interest.


Some comments:

[[
2 What Does a URI Identify?

On the web URIs identify resources."Any information that can be named can 
be a resource." [RFC2396]. In fact, this relationship can be taken as 
axiomatic: if a resource has a URI, it is identifiable on the web. If it 
does not, it is not.
]]

I cannot find the quoted text ("Any information ...") in RFC 2396.

[[

2.2 Resources
...
The set of values mapped by a resource are equivalent resource 
representations and/or resource identifiers (giving further indirection or 
redirection). Dereferencing a resource identifier yields a representation 
of the current value of the referenced resource. At some time, t, the set 
of values that a resource maps to may be empty, which allows a concept to 
be identified before a realisation of the concept exists (or indeed after 
it has been retired).
]]

What notion of equivalence is meant here?  How can I determine whether two
values are equivalent?  This paragraph talks about "the current value".  Is
this the same sense of the term 'value' used in "At some time, t, the set
of values that ...", or is there some notion that a resource has state, and
it is the value of that state that is referred to?

[[
RDF provides the ability to described resources by their relationship to 
one another which leads to the notion of existentally qualified resources. 
For example, there exists a person whose internet mailbox is identified by 
the URI mailto:timbl@w3.org. This identifies the person of Tim Berners-Lee 
by reference to the URI of his internet mailbox without it being necessary 
to assign a URI to identify the concept of the person Tim Berners-Lee.
]]

It is not the resource that is existentially quantified.  RDF has the notion
of a b-node, which performs a role similar to that of existentially
quantified variables in first-order logic.  Just as in:

   x + 3 = 4

x is not the number 1 but a variable, so b-nodes in RDF are not
resources but variables.  Any of the values a b-node can take can be
assigned a URI.
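
As a rough sketch of that reading, the mailbox example from the quoted
text could be written in first-order notation as below.  The predicate
name "mailbox" is my own illustrative assumption, not RDF vocabulary:

```latex
\exists x \,.\; \mathit{mailbox}\bigl(x,\ \texttt{mailto:timbl@w3.org}\bigr)
```

Here x plays the role of the b-node: the statement asserts that something
with that mailbox exists, without naming it.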

[[
2.3 Properties of Resources
...

Two different URI's may identify the same resource, but it is only the 
authorities that asssign those URIs that can make the commitment to them 
identifying the same resource.
]]

Can they?  Is that a proposal?

The alternative is to say that each different URI denotes a different
resource, and to define a notion of equivalence between resources.
Different notions of equivalence are possible: resources A and B denote
the same set of values at time t, for a set of time intervals {[t1,t2]},
or over all time.

Consider, for example, http://www.w3.org/.  This web page is mirrored; I
don't know what the URLs of the mirrors are; let's say
http://www.w3.inria.fr/ is one for the purpose of discussion.  There is
presumably a propagation delay between updating the master version and that
change propagating, so there is a period when an HTTP GET on the two
different URLs will return different values.  Does this mean that these
two URIs denote different resources, or is it that the implementation is
an imperfect realization of the ideal?

More importantly, how can we know that these two URLs will always denote
the same set of values?  We cannot predict the future.  The French
government could choose, next month, to require that all web pages served
from French web servers contain some metadata which depends on the origin
of the page.  How can we say today that two URLs will, for all time,
denote the same mapping to values?
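
To make the different notions of equivalence concrete, here is a minimal
Python sketch.  Everything in it is an assumption for illustration: a
resource is modelled as a function from a time to a set of values, and the
"master" and "mirror" functions, their update times, and the value names
"v1" and "v2" are invented.

```python
from typing import Callable, FrozenSet, Iterable, Tuple

# Assumed model: a resource maps a time t to the set of values it denotes.
Resource = Callable[[float], FrozenSet[str]]


def equivalent_at(a: Resource, b: Resource, t: float) -> bool:
    """Equivalence at a single instant t."""
    return a(t) == b(t)


def equivalent_over(a: Resource, b: Resource,
                    intervals: Iterable[Tuple[float, float]],
                    step: float = 1.0) -> bool:
    """Sampled equivalence over closed intervals [t1, t2].

    Sampling can refute equivalence but cannot prove it, and it says
    nothing about times outside the intervals (e.g. the future).
    """
    for t1, t2 in intervals:
        t = t1
        while t <= t2:
            if not equivalent_at(a, b, t):
                return False
            t += step
    return True


# Hypothetical master and mirror: the master changes at t=10,
# the mirror catches up at t=12 (the propagation delay).
def master(t: float) -> FrozenSet[str]:
    return frozenset({"v2" if t >= 10 else "v1"})


def mirror(t: float) -> FrozenSet[str]:
    return frozenset({"v2" if t >= 12 else "v1"})
```

Under this model the two agree at t=5 and over [0, 9], but not at t=11 or
over [0, 20].  Equivalence "over all time" cannot be checked this way at
all, which is the point about not being able to predict the future.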

[[
We are dealing here with two time dependent mappings. Firstly a time 
dependent mapping between and identifier and a resource ...
]]

Oh that's horrible!  Later in the document it states:

[[
An absolute URI always means the same thing, regardless of the context in 
which it occurs.
]]

and

[[
The resource identified by a particular URI should always be "the same", 
when it is identified by that URI.
]]

That seems a little contradictory.

[[3.1 What about Fragment Identifiers?

If a URI contains an sharp character (a " # "), the string that follows the 
" # " is a fragment identifier. Fragment identifiers are a mechanism for 
identifying part of a resource.
]]

Are resources atomic, or can the parts of a resource also be resources?

[[
This means that in general, it's not possible to determine what a fragment 
identifier means without retreiving the resource into which it points.
]]

This sentence uses the term 'means' which is rather ill defined here.

If this sentence is trying to say that it is not possible to determine the 
bytes which represent the fragment without retrieving a representation of 
the whole resource, then that is true given current web practice.  But if
that is the sense in which the word 'means' is used here, then it is also 
not possible to determine what http://www.w3.org/  *means* without 
retrieving it.

[[
The fragment identifier identifies some sub-part of a resource representation.
]]

I don't follow this.  Consider the resource identified by
http://example.org/doc/.  Consider that there are two representations of
this resource, one in, say, XHTML and the other in SVG, and that each
contains a fragment '#chapter1'.  Can we not say that
http://example.org/doc/#chapter1 names chapter one of the document?  Can we
not say that http://example.org/doc/#chapter1 names a resource, and that to
display that resource a browser has to retrieve the resource
http://example.org/doc/ and then interpret the value returned in a way that
is dependent on the mimetype to compute the representation of chapter 1?

The fact that computing the representation of a fragment is mimetype
dependent does not mean that a URI with a fragment identifier cannot name
an abstraction which has multiple representations with different mimetypes.
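
A small Python sketch of what I have in mind follows.  It is only an
illustration under assumptions: the two document bodies, the REPRESENTATIONS
and RESOLVERS tables, and the function names are all invented, and real
browsers apply their own per-format rules for fragments.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urldefrag

# Hypothetical representations of one resource, keyed by (URI, mimetype).
REPRESENTATIONS = {
    ("http://example.org/doc/", "application/xhtml+xml"):
        '<html><body><div id="chapter1">Chapter One</div></body></html>',
    ("http://example.org/doc/", "image/svg+xml"):
        '<svg xmlns="http://www.w3.org/2000/svg">'
        '<g id="chapter1"><text>Chapter One</text></g></svg>',
}


def find_by_id(root, fragment):
    """Return the first element whose id attribute equals the fragment."""
    for element in root.iter():
        if element.get("id") == fragment:
            return element
    return None


def resolve_xhtml(body, fragment):
    """Assumed XHTML rule: match an id, else fall back to <a name=...>."""
    root = ET.fromstring(body)
    element = find_by_id(root, fragment)
    if element is None:
        for anchor in root.iter("a"):
            if anchor.get("name") == fragment:
                return anchor
    return element


def resolve_svg(body, fragment):
    """Assumed SVG rule: match an id only."""
    return find_by_id(ET.fromstring(body), fragment)


# The mimetype selects how the fragment is interpreted.
RESOLVERS = {
    "application/xhtml+xml": resolve_xhtml,
    "image/svg+xml": resolve_svg,
}


def locate(uri, mimetype):
    """Split off the fragment, pick a representation, resolve within it."""
    base, fragment = urldefrag(uri)
    body = REPRESENTATIONS[(base, mimetype)]
    return RESOLVERS[mimetype](body, fragment)
```

The single name http://example.org/doc/#chapter1 is handed to locate() with
either mimetype; only the last step, finding the bytes, differs per format.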

[[
A URI that consists of only a fragment identifier (i.e, one that begins 
with a " # ") always points into the document that contains the URI, 
irrespective of the effective base URI.
]]

This statement is presumably based on RFC 2396:

[[
4.2. Same-document References

    A URI reference that does not contain a URI is a reference to the
    current document.  In other words, an empty URI reference within a
    document is interpreted as a reference to the start of that document,
    and a reference containing only a fragment identifier is a reference
    to the identified fragment of that document.
]]

However, there is an escape clause.  The same paragraph goes on to say:

[[
4.2. Same-document References
[...]
    However, if the URI reference occurs in a context that is always
    intended to result in a new request, as in the case of HTML's FORM
    element, then an empty URI reference represents the base URI of the
    current document and should be replaced by that URI when transformed
    into a request.
]]
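
For concreteness, here is how I read the two quoted passages, as a Python
sketch.  The function resolve_reference and its new_request flag are my own
invention for illustration, and the URIs in the usage note are made up; this
is one reading of section 4.2, not a complete RFC 2396 resolver.

```python
from urllib.parse import urldefrag, urljoin


def resolve_reference(reference, document_uri, base_uri, new_request=False):
    """Resolve a URI reference found inside a document.

    document_uri is where the containing document was retrieved from;
    base_uri is its effective base URI, which may differ.
    new_request marks a context that always results in a new request,
    such as an HTML FORM.
    """
    if reference == "":
        # Escape clause: in a new-request context the empty reference
        # stands for the base URI; otherwise it is the current document.
        return base_uri if new_request else document_uri
    if reference.startswith("#"):
        # Fragment-only reference: points into the containing document,
        # irrespective of the effective base URI.
        return urldefrag(document_uri)[0] + reference
    # Anything else is resolved against the base URI as usual.
    return urljoin(base_uri, reference)
```

With a document at http://example.org/doc/ and a base URI of
http://mirror.example.org/other/, the reference "#chapter1" stays within
http://example.org/doc/, while an empty reference yields the document URI
normally and the base URI when new_request is true.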

Brian
Received on Saturday, 16 March 2002 08:25:41 UTC