RE: Terminology (was Re: article on URIs, is this material that can be used by the) from Rhys Lewis on 2007-07-09 (www-tag@w3.org from July 2007)

From: Rhys Lewis <rhys@volantis.com>
Date: Mon, 9 Jul 2007 04:42:15 -0700 (PDT)
To: "'Pat Hayes'" <phayes@ihmc.us>
Cc: <noah_mendelsohn@us.ibm.com>, "'Dan Brickley'" <danbri@danbri.org>, "'Henry S. Thompson'" <ht@inf.ed.ac.uk>, "'Tim Berners-Lee'" <timbl@w3.org>, <www-tag@w3.org>
Message-ID: <001201c7c21e$7a90dcf0$84a6f40a@volantisuk>

Hello Pat,

Just wanted to say thanks for your response. That was very helpful to me
in my attempts to understand this particular topic.

And you are absolutely correct, I had missed a particularly important
point.

I need to think about this a bit more before following up.

Best wishes
Rhys 

-----Original Message-----
From: Pat Hayes [mailto:phayes@ihmc.us] 
Sent: 06 July 2007 18:36
To: Rhys Lewis
Cc: noah_mendelsohn@us.ibm.com; Dan Brickley; Henry S. Thompson; Tim
Berners-Lee; www-tag@w3.org
Subject: RE: Terminology (was Re: article on URIs, is this material that
can be used by the)

>
>Hello Pat,
>
>I'd like to ask my first dumb questions, if I may.

I'm sure they won't be dumb.

>They concern your
>recent response to Noah's comments.
>
>Noah wrote:
>"Here's what I think may be the essence of the confusion:  there are 
>certain systems in which it is by definition possible to attempt to 
>access anything that can be referenced."
>
>You responded:
>"Indeed this may well be part of the confusion. OK, there are a few 
>such systems (a VERY few), but the Web is not one of them; or at least, 
>not if it's understood as described in the Web Architecture document. 
>That document is at pains to explain, quite early, that resources 
>identified by URIs can be physical things not connected in any way to 
>the Internet, such as books and people. From that point on, all talk 
>about attempting to access things that can be referenced is obviously 
>crazy. You can't use HTTP or any other xxTP to access people and books. 
>(You can maybe access some kind of description of them, which could be 
>called a representation of them, although not in the same sense of 
>"representation" used by you and the architecture document; and not by 
>getting a TP poke to them and causing them to emit a representation in 
>response. But you didn't say
>'representation': you said, access anyTHING that can be referenced.)"
>
>My question concerns whether I've interpreted your response correctly. 
>I was somewhat surprised by your apparent assertion that the Web is not 
>a system which Noah characterised as one 'in which by definition it is 
>possible to ATTEMPT (my emphasis) to access anything that can be 
>referenced'.

Before reading on, let me emphasize that I am taking Noah's words here at
their face value. That is, the situation as he describes it is one where
(1) there is a URI (2) it is known that this URI is intended to reference,
say, a person or a book (a "non-information resource"), and (3) it makes
sense to attempt to access this person or book ("the thing referenced").
But I know, without further ado, that it does not make sense even to
attempt to access a person or book (or galaxy, ..etc.).

>Let me first try and explain my surprise.
>
>It seems to me that the Web has always had the property that you can 
>attempt to access that which a URI identifies (I'm assuming HTTP 
>throughout this), but there is no guarantee that the access operation 
>will be successful.

You can attempt to access >something<, yes. But whether or not that thing
that you attempt to access is (or can possibly be) what the URI >refers
to< is an open question. (I really, honestly, do not know what is meant by
the word 'identify' in this context: it seems to change its meaning at
almost every usage.) One thing we can be certain of, however, is that >if<
the things that can be referred to include books, people and galaxies,
then there >must< be cases where it does not make sense to attempt to
access what is being referred to.

>Indeed, there are no guarantees about what might be returned in 
>response to a successful access operation. This is a consequence of the 
>localisation, of the definitive information about that which a URI 
>identifies, to the resource itself. It is, I believe, in large part the 
>source of the Web's scalability.

I agree. Long live 404 errors.

>Clearly, it would be strange to try and access something via the Web 
>which was known to be 'physical' and hence not able to emit anything in 
>response to a transport protocol 'poke'.

Quite.

>However, in general, the Web doesn't
>provide any guaranteed way to know that in advance.

Ah, but the semantic web does. If someone asserts

http://ex:thingie rdf:type dc:author .

and I have good reason to believe it, then I have good reason to believe
that 'http://ex:thingie' denotes a person.
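
A minimal sketch of this point, assuming a hypothetical in-memory set of
triples rather than any real RDF library. The names TRIPLES, PERSON_CLASSES,
asserted_types and denotes_person are invented for illustration, and treating
dc:author as a class of persons follows the example above rather than any
vocabulary definition. The only thing it shows is that the kind of thing a
URI denotes can be read off a believed assertion, with no access attempted:

```python
# Hypothetical store of believed assertions: (subject, predicate, object).
TRIPLES = {
    ("http://ex:thingie", "rdf:type", "dc:author"),
}

# Assumption for illustration: classes whose members are taken to be people.
PERSON_CLASSES = {"dc:author"}


def asserted_types(uri, triples=TRIPLES):
    """Return the classes asserted for `uri`; no network access is made."""
    return {o for (s, p, o) in triples if s == uri and p == "rdf:type"}


def denotes_person(uri, triples=TRIPLES):
    """True if some believed assertion types `uri` as a person class."""
    return bool(asserted_types(uri, triples) & PERSON_CLASSES)
```

Nothing here dereferences the URI: the conclusion rests entirely on the
assertion and on how far it is believed.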

>It's not possible to
>know whether that which a URI identifies will respond to such a 'poke', 
>by returning some material, without actually trying it.

Wait. It isn't possible to know what you will get back by poking a URI
without trying the poke and seeing. Yes, true. But as to whether what you
are poking is the >referent< of the URI, that is a different question. We
know (see above) there must be cases where it cannot possibly be, because
what the URI refers to simply isn't the kind of thing that can get poked
in the required way.

So yes, the only way to tell what the accessible thing (if there is
one) does is to poke it and see; but that is irrelevant to questions of
reference. Neither the thing poked nor what it sends back in return need
be what the URI >denotes< or >refers to<. This is one good reason why we
need to distinguish between what you get when you poke with it, and what
it refers to. They need not be the same. I think that if the SWeb ever
takes off (or begins to slide downhill, using Tim's bobsled metaphor) then
*most* URIs will be like this.

>Now, I agree with you that it could well be described as crazy to make 
>subsequent attempts to access something via the Web that you already 
>know is 'physical'. But using the core facilities provided by the Web, 
>it's not possible to know that for sure without attempting that first 
>access.

I'm not sure if RDF and Web ontologies count as 'core', but I am assuming
them to be part of the Web.

>Some systems may, of course, provide mechanisms for making assertions 
>about URIs that can help avoid the need to attempt an access in order 
>to find out about what is being referenced.

Without some such making of assertions, it is impossible to either
determine or record what kind of entity a URI is being used to refer to.

>But since URIs are universal,
>other more general systems, encountering such URIs, may attempt access 
>because they are unaware of that additional information and hence don't 
>know any better.

Isn't that what 404 errors are for?

>Only by attempting an access can they find out anything more about the 
>resource.

Again, I think this is a mistake. Failure of access tells you that some
information isn't in the place you thought it might be, given its name.
That isn't the same kind of question as asking what the name denotes. If
the denoted resource is a person or a book, you cannot possibly find out
anything by attempting to access it, other than that it cannot be accessed.
(And of course there can be any number of other reasons why it cannot be
accessed.) Notice I'm distinguishing here between poking blindly to see
what happens, and 'attempting to access' a particular thing.
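
One way to picture the distinction, as a sketch only: the record type
UriRecord and the helper record_poke are hypothetical names, and the kind
labels and status codes are just example values. What is asserted about the
referent and what happened on the last access attempt are kept in separate
fields, and recording an access outcome never touches the asserted kind:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class UriRecord:
    uri: str
    asserted_kind: Optional[str] = None  # from assertions, e.g. "person"
    last_status: Optional[int] = None    # from poking, e.g. 200 or 404


def record_poke(record: UriRecord, status: int) -> UriRecord:
    """Store an access outcome; deliberately leaves asserted_kind alone."""
    record.last_status = status
    return record
```

A 404 stored this way says only that nothing was retrievable at that name;
it says nothing, either way, about what the name denotes.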

>You are probably aware that the behaviour associated with attempts to 
>access URIs that identify physical things occupies a large part of the 
>httpRange-14 finding on which the TAG is currently working.

Yes, and I think that this finding is profoundly flawed. (BTW,
non-information resources do not have to be physical: for example,
fictional characters cannot be poked, and neither can integers or
relations or classes.)

>Ok, so now I'd better try and phrase the questions. First, does it 
>sound, from what I've written here, as though I understood your 
>response correctly, or have I simply missed the point?

I think you have missed my point, yes. You seem to be working on the
assumption that the only way to discover whether a URI >refers to<
something is to use it to access (and if it succeeds, then it refers to
the [source of the] retrieved data, more or less), i.e. that successful
access determines reference, which is exactly what I am arguing should NOT
be assumed.

>Second, if I have understood
>your response, does this throw any light on why it's important that 
>attempts to access physical things identified by URIs need to be 
>supported by the Web in order for there to be a general mechanism by 
>which it is possible to discover that the thing is indeed physical?

I'm afraid not. Why does the current (non-semantic) Web care what a URI
refers to, and whether or not that thing is physical?  Suppose I were to
suggest that the Web suffers from a vitality crisis, in that some URIs
refer to living things and others do not, and it is vital for HTTP engines
to distinguish these, so any HTTP GET should emit a special error code
(909) to signal that its referent is alive. This is almost exactly similar
to the httpRange-14 finding, and almost exactly as silly.

The Web without the SWeb simply does not concern itself with semantic
questions of reference >>at all<<. They do not arise. It is concerned only
with moving chunks of information from place to place (and archiving them,
and so forth: I do not mean to suggest this is all trivial or not worth
serious effort to properly design.) Reference is a semantic notion,
concerned with what the names in the texts refer to. These have nothing to
do with one another. That httpRange-14 is even being discussed by the TAG
at all is a symptom of a confusion between two distinct notions, both of
which are being referred to by the word 'identifies'.

Do you see my point?

>Very best wishes

and to you

Pat

>Rhys Lewis

PS. That reads like a Welsh name. I have happy memories of an early
childhood in Maesteg, in the Llynfi valley.



-- 
---------------------------------------------------------------------
IHMC		(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.	(850)202 4416   office
Pensacola			(850)202 4440   fax
FL 32502			(850)291 0667    cell
phayesAT-SIGNihmc.us       http://www.ihmc.us/users/phayes
Received on Monday, 9 July 2007 11:42:35 UTC