RE: Terminology (was Re: article on URIs, is this material that can be used by the) from Pat Hayes on 2007-07-06 (www-tag@w3.org from July 2007)

From: Pat Hayes <phayes@ihmc.us>
Date: Fri, 6 Jul 2007 12:35:39 -0500
To: "Rhys Lewis" <rhys@volantis.com>
Cc: <noah_mendelsohn@us.ibm.com>, "Dan Brickley" <danbri@danbri.org>, "Henry S. Thompson" <ht@inf.ed.ac.uk>, "Tim Berners-Lee" <timbl@w3.org>, <www-tag@w3.org>
Message-Id: <p0623090ac2b41ba4e3ce@[10.100.0.9]>
>
>Hello Pat,
>
>I'd like to ask my first dumb questions, if I may.

I'm sure they won't be dumb.

>They concern your
>recent response to Noah's comments.
>
>Noah wrote:
>"Here's what I think may be the essence of the confusion:  there are
>certain systems in which it is by definition possible to attempt to
>access anything that can be referenced."
>
>You responded:
>"Indeed this may well be part of the confusion. OK, there are a few such
>systems (a VERY few), but the Web is not one of them; or at least, not if
>it's understood as described in the Web Architecture document. That
>document is at pains to explain, quite early, that resources identified by
>URIs can be physical things not connected in any way to the Internet, such
>as books and people. From that point on, all talk about attempting to
>access things that can be referenced is obviously crazy. You can't use
>HTTP or any other xxTP to access people and books. (You can maybe access
>some kind of description of them, which could be called a representation
>of them, although not in the same sense of "representation" used by you
>and the architecture document; and not by getting a TP poke to them and
>causing them to emit a representation in response. But you didn't say
>'representation': you said, access anyTHING that can be referenced.)"
>
>My question concerns whether I've interpreted your response correctly. I
>was somewhat surprised by your apparent assertion that the Web is not a
>system which Noah characterised as one 'in which by definition it is
>possible to ATTEMPT (my emphasis) to access anything that can be
>referenced'.

Before reading on, let me emphasize that I am taking Noah's words 
here at their face value. That is, the situation as he describes it 
is one where (1) there is a URI (2) it is known that this URI is 
intended to reference, say, a person or a book (a "non-information 
resource"), and (3) it makes sense to attempt to access this person 
or book ("the thing referenced"). But I know, without further ado, 
that it does not make sense even to attempt to access a person or 
book (or galaxy, etc.).

>Let me first try and explain my surprise.
>
>It seems to me that the Web has always had the property that you can
>attempt to access that which a URI identifies (I'm assuming HTTP
>throughout this), but there is no guarantee that the access operation will
>be successful.

You can attempt to access >something<, yes. But whether or not that 
thing that you attempt to access is (or can possibly be) what the 
URI >refers to< is an open question. (I really, honestly, do not know 
what is meant by the word 'identify' in this context: it seems to 
change its meaning at almost every usage.) One thing we can be 
certain of, however, is that >if< the things that can be referred to 
include books, people and galaxies, then there >must< be cases where 
it does not make sense to attempt to access what is being referred to.

>Indeed, there are no guarantees about what might be
>returned in response to a successful access operation. This is a
>consequence of the localisation, of the definitive information about that
>which a URI identifies, to the resource itself. It is, I believe, in large
>part the source of the Web's scalability.

I agree. Long live 404 errors.

>Clearly, it would be strange to try and access something via the Web which
>was known to be 'physical' and hence not able to emit anything in response
>to a transport protocol 'poke'.

Quite.

>However, in general, the Web doesn't
>provide any guaranteed way to know that in advance.

Ah, but the semantic web does. If someone asserts

http://ex:thingie rdf:type dc:author .

and I have good reason to believe it, then I have good reason to 
believe that 'http://ex:thingie' denotes a person.
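
A minimal sketch of that inference, for illustration only. The in-memory triple set, the CLASS_KIND table, and the function name denoted_kind are hypothetical and not part of any RDF library; the point is just that the kind of thing denoted is read off the assertions, with no access attempt at all.

```python
# Hypothetical in-memory triple store; nothing here touches the network.
RDF_TYPE = "rdf:type"

triples = {
    ("http://ex:thingie", RDF_TYPE, "dc:author"),
}

# Assumed background knowledge: anything typed dc:author is a person.
CLASS_KIND = {"dc:author": "person"}

def denoted_kind(uri, triples, class_kind):
    """Return the kind of thing `uri` denotes per the asserted types, or None."""
    for s, p, o in triples:
        if s == uri and p == RDF_TYPE and o in class_kind:
            return class_kind[o]
    return None  # nothing asserted, so nothing known about the referent

print(denoted_kind("http://ex:thingie", triples, CLASS_KIND))  # person
```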

>It's not possible to
>know whether that which a URI identifies will respond to such a 'poke', by
>returning some material, without actually trying it.

Wait. It isn't possible to know what you will get back by poking a 
URI without trying the poke and seeing. Yes, true. But as to whether 
what you are poking is the >referent< of the URI, that is a different 
question. We know (see above) there must be cases where it cannot 
possibly be, because what the URI refers to simply isn't the kind of 
thing that can get poked in the required way.

So yes, the only way to tell what the accessible thing (if there is 
one) does is to poke it and see; but that is irrelevant to questions 
of reference. Neither the thing poked nor what it sends back in 
return need be what the URI >denotes< or >refers to<. This is one 
good reason why we need to distinguish between what you get when you 
poke with it, and what it refers to. They need not be the same. I 
think that if the SWeb ever takes off (or begins to slide downhill, 
using Tim's bobsled metaphor) then *most* URIs will be like this.
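
The distinction can be sketched as two independent fields on a record, one for what the URI is asserted to denote and one for what came back from a poke. Everything here (UriFacts, UNPOKEABLE, poke_reached_referent, and the example.org URIs) is an assumption made for illustration, and the status codes are simulated rather than fetched.

```python
from dataclasses import dataclass
from typing import Optional

# Kinds of thing that cannot respond to a transport-protocol poke (illustrative list).
UNPOKEABLE = {"person", "book", "galaxy", "integer", "fictional character"}

@dataclass
class UriFacts:
    uri: str
    asserted_kind: Optional[str]  # what assertions say the URI denotes, if anything
    poke_status: Optional[int]    # HTTP status of an access attempt, if one was made

def poke_reached_referent(facts: UriFacts) -> Optional[bool]:
    """False if the referent cannot be poked; None (unknown) otherwise."""
    if facts.asserted_kind in UNPOKEABLE:
        return False  # whatever answered, it was not the referent
    return None       # a status code alone never settles reference

pat = UriFacts("http://example.org/pat", "person", 200)
page = UriFacts("http://example.org/page", None, 200)
print(poke_reached_referent(pat))   # False
print(poke_reached_referent(page))  # None
```

Both records carry the same successful status, yet the answer about reference differs, because it depends only on the asserted kind.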

>Now, I agree with you that it could well be described as crazy to make
>subsequent attempts to access something via the Web that you already know
>is 'physical'. But using the core facilities provided by the Web, it's not
>possible to know that for sure without attempting that first access.

I'm not sure if RDF and Web ontologies count as 'core', but I am 
assuming them to be part of the Web.

>Some systems may, of course, provide mechanisms for making assertions
>about URIs that can help avoid the need to attempt an access in order to
>find out about what is being referenced.

Without some such making of assertions, it is impossible to either 
determine or record what kind of entity a URI is being used to refer 
to.

>But since URIs are universal,
>other more general systems, encountering such URIs, may attempt access
>because they are unaware of that additional information and hence don't
>know any better.

Isn't that what 404 errors are for?

>Only by attempting an access can they find out anything
>more about the resource.

Again, I think this is a mistake. Failure of access tells you that 
some information isn't in the place you thought it might be, given 
its name. That isn't the same kind of question as asking what the 
name denotes. If the denoted resource is a person or a book, you 
cannot possibly find out anything by attempting to access it, other 
than that it cannot be accessed. (And of course there can be any number 
of other reasons why it cannot be accessed.) Notice I'm distinguishing 
here between poking blindly to see what happens, and 'attempting to 
access' a particular thing.

>You are probably aware that the behaviour
>associated with attempts to access URIs that identify physical things
>occupies a large part of the httpRange-14 finding on which the TAG is
>currently working.

Yes, and I think that this finding is profoundly flawed. (BTW, 
non-information resources do not have to be physical: for example, 
fictional characters cannot be poked, and neither can integers or 
relations or classes.)

>Ok, so now I'd better try and phrase the questions. First, does it sound,
>from what I've written here, as though I understood your response
>correctly, or have I simply missed the point?

I think you have missed my point, yes. You seem to be working on the 
assumption that the only way to discover whether a URI >refers to< 
something is to use it to access (and if it succeeds, then it refers 
to the [source of the] retrieved data, more or less), i.e. that 
successful access determines reference, which is exactly what I am 
arguing should NOT be assumed.

>Second, if I have understood
>your response, does this throw any light on why it's important that
>attempts to access physical things identified by URIs need to be supported
>by the Web in order for there to be a general mechanism by which it is
>possible to discover that the thing is indeed physical?

I'm afraid not. Why does the current (non-semantic) Web care what a 
URI refers to, and whether or not that thing is physical?  Suppose I 
were to suggest that the Web suffers from a vitality crisis, in that 
some URIs refer to living things and others do not, and it is vital 
for HTTP engines to distinguish these, so any HTTP GET should emit a 
special error code (909) to signal that its referent is alive. This 
is almost exactly analogous to the httpRange-14 finding, and almost 
exactly as silly.

The Web without the SWeb simply does not concern itself with semantic 
questions of reference >>at all<<. They do not arise. It is concerned 
only with moving chunks of information from place to place (and 
archiving them, and so forth: I do not mean to suggest this is all 
trivial or not worth serious effort to properly design.) Reference is 
a semantic notion, concerned with what the names in the texts refer 
to. These have nothing to do with one another. That httpRange-14 is 
even being discussed by the TAG at all is a symptom of a confusion 
between two distinct notions, both of which are being referred to by 
the word 'identifies'.

Do you see my point?

>Very best wishes

and to you

Pat

>Rhys Lewis

PS. That reads like a Welsh name. I have happy memories of an early 
childhood in Maesteg, in the Llynfi valley.



-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Friday, 6 July 2007 17:35:56 UTC