RE: on documents and terms [was: RE: [WNET] new proposal WN URIs and related issues]

Hi Dan,

> From: Dan Connolly [mailto:connolly@w3.org] 
> > From: David Booth
> . . .
> > Similarly, if http://weather.example.com/oaxaca identifies a single 
> > resource that is "a periodically updated report on the weather in 
> > Oaxaca"[10], then I don't see how "all of [the] essential 
> > characteristics"[10] of that periodically updated report can be 
> > "conveyed in a message"[10].
> 
> Again, it seems to me that we do this routinely. Maybe it 
> takes more than one message and webarch is a bit sloppy here. 

But that is the crucial difference!  Sure, a *single* weather report can
be conveyed in a message.   But http://weather.example.com/oaxaca is not
merely identifying a *single* weather report issued at 2005-03-12
23:11:36.236 UTC or any other particular time.  It identifies a
*function* from time to weather reports.  I don't know any way to
transmit "all of [the] essential characteristics"[10] of that particular
function in a message or even a finite set of messages.
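
To make the distinction concrete, here is a minimal Python sketch. The names
(WeatherReport, oaxaca_resource) and the day/night rule are invented for
illustration; nothing here comes from webarch itself. It only shows the
difference between the function and the individual values it yields.

```python
from datetime import datetime, timezone
from typing import Callable, NamedTuple


class WeatherReport(NamedTuple):
    """One particular report: this is what fits in a single message."""
    issued: datetime
    text: str


# Assumed model: the resource is a function from time to weather reports.
Resource = Callable[[datetime], WeatherReport]


def oaxaca_resource(t: datetime) -> WeatherReport:
    # Invented rule standing in for whatever would be reported at time t.
    sky = "sunny" if 6 <= t.hour < 18 else "clear night"
    return WeatherReport(issued=t, text=f"Oaxaca: {sky}")


t1 = datetime(2005, 3, 12, 23, 11, 36, tzinfo=timezone.utc)
t2 = datetime(2005, 3, 13, 12, 0, 0, tzinfo=timezone.utc)

report1 = oaxaca_resource(t1)  # a single report, transmissible in a message
report2 = oaxaca_resource(t2)  # a different report from the same resource
```

A message can carry report1 or report2, but each is only one value of the
function at one instant, not the function that the URI identifies.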

> 
> > Because "information resources" can return different 
> > "representations" 
> > at different times (even if some happen to return the same 
> > representation every time), it seems to me that "information 
> > resources" are by their very nature abstract.
> 
> Please be careful with your quantifiers. Your argument seems to go
> from:
>    There are some information resources that have more than one
>    representation and hence are abstract to
>    All information resources have more than one representation.

Almost.  My argument goes from "Some information resources have more
than one representation and hence are abstract" to "All information
resources are abstract".   Here is the justification.  (For clarity,
I'll avoid the term "abstract" below, and instead speak of "functions
from time to data", since that is more precise.)

1. Given: A URI identifies a *single* resource.

2. Any "information resource" that is intended to be time varying (such
as the "current weather report in Oaxaca") is obviously a function from
time to data, as illustrated above.  Thus, we know that some
"information resources" are functions from time to data.

3. For other "information resources" that are plain Web pages, if those
Web pages ever change, then those "information resources" must also be
functions from time to data.

4. The HTTP protocol and the URI resolution mechanism are such that the
content associated with a URI *always* has the *potential* of changing.
Thus, the content associated with a URI is *inherently* changeable over
time, even if by policy some Web pages are intended to remain constant.

5. I haven't a clue what utility there would be in calling something an
"information resource" if that thing is never ever intended to return
some data in a 2xx response to an HTTP GET.

Therefore, by Occam's Razor I conclude:

	All "information resources" are functions from time to data.

instead of:

	Some information resources are functions from time to data,
	while others might merely be constant data.
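
The simplification can be sketched in Python. This is only an illustration
under my own assumptions: the Resource type and the helper names constant and
daily_page are hypothetical. The point is that a page that never changes fits
the same type as one that does, as a constant function, so no second category
is needed.

```python
from datetime import datetime, timezone
from typing import Callable

# Assumed model: every information resource maps a time to some data.
Resource = Callable[[datetime], bytes]


def constant(data: bytes) -> Resource:
    """Wrap fixed data as a resource that returns it at every time."""
    return lambda t: data


def daily_page(t: datetime) -> bytes:
    """A time-varying resource: its data depends on the date."""
    return f"Page for {t.date().isoformat()}".encode("ascii")


frozen_spec: Resource = constant(b"fixed specification text")
changing: Resource = daily_page

t1 = datetime(2006, 5, 3, tzinfo=timezone.utc)
t2 = datetime(2006, 5, 4, tzinfo=timezone.utc)

same_over_time = frozen_spec(t1) == frozen_spec(t2)
differs_over_time = changing(t1) != changing(t2)
```

Both frozen_spec and changing have the one type Resource; the first merely
happens to return the same bytes whenever it is asked.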

> . . . I don't think there's any (reasonable) 
> meaning of "words" where the TAG has decided that 
> w:InformationResource has no intersection with it.

If "frog" is a word (i.e., those four letters in sequence), and you
accept my conclusion above (that w:InformationResources are functions
from time to data), then "frog" cannot be a w:InformationResource
because it obviously is not a function from time to data.  It is merely
data.

> 
> On the contrary, I think the IETF has made it pretty
> clear that http://www.ietf.org/rfc/rfc822.txt has just
> one representation. And they haven't done anything to
> make the resource itself distinguishable from its 
> representation, so if they said the 2 are identical, that 
> would be coherent.
> 
> Likewise, W3C has bound the URI
>   http://www.w3.org/TR/2002/REC-xhtml1-20020801/DTD/xhtml1-strict.dtd
> to a particular sequence of bytes/characters.
> 
> 
> > Clearly the notion of an "information resource" is modeled 
> > after the 
> > real life notion of the contents of a (logical) disk 
> > region, on a Web 
> > server, that is associated with a URI "racine".  (The 
> > "racine" is all 
> > of the URI except the fragment identifier.[11])  The server is 
> > configured to return those contents, whatever they are, 
> > when the URI 
> > racine is dereferenced.  And those contents may change over time!  
> > Thus, the URI racine is not identifying any *particular* 
> > contents, it 
> > is identifying the logical *location* where those contents 
> > are stored, 
> > and the server provides whatever contents happen to be 
> > stored there at 
> > the moment they are requested.
> 
> Yes, but W3C and the IETF promise that some parts of our 
> disks won't change.
> 
> > In fact, it is not even possible on the Web to create a URI that is 
> > permanently bound to a single document instance that can 
> > never change:
> 
> I gave 2 counter-examples above.

No, you gave examples of URIs that are bound to content that, by today's
policy, is not *intended* to change.  The fact is, the content *can* be
changed, even intentionally, by the owners.
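
A rough sketch of what I mean, in Python. The Server class and its put/get
methods are made up for illustration and are not any real server's API; the
path is the one from your example, and the contents are placeholders.

```python
class Server:
    """Hypothetical server: maps each path to whatever is stored there now."""

    def __init__(self) -> None:
        self._store: dict[str, bytes] = {}

    def put(self, path: str, content: bytes) -> None:
        # The owner may call this at any time; nothing in the mechanism
        # forbids replacing content that was promised to be stable.
        self._store[path] = content

    def get(self, path: str) -> bytes:
        # Dereferencing returns the contents stored at the moment of the request.
        return self._store[path]


server = Server()
path = "/rfc/rfc822.txt"

server.put(path, b"original text")
before = server.get(path)

server.put(path, b"revised text")  # an intentional change of policy by the owner
after = server.get(path)
```

The promise not to call put again is a policy of the owner. The path itself
names the location, and get returns whatever is there when asked.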

> 
> > it is *always* possible to change the server configuration 
> > or domain 
> > IP mapping to cause a different document instance to be served.
> 
> That would be a bug, in the 2 cases above.

What I meant was, if the domain owners' policies change, then the
documents may be changed *intentionally*.  That's a feature, not a bug.

> 
> >   In other
> > words, an http URI on the real Web identifies a logical *location* 
> > whose content *always* has the potential of changing.
> 
> I don't agree.

I don't understand how this statement could be subject to dispute.  Can
you explain?

David Booth

Received on Thursday, 4 May 2006 05:07:11 UTC