Re: Terminology (was Re: article on URIs, is this material that can be used by the)

From: Stuart Williams <skw@hp.com>
Date: Mon, 09 Jul 2007 12:38:48 +0100
Message-ID: <46921E48.2010001@hp.com>
To: Pat Hayes <phayes@ihmc.us>
CC: Rhys Lewis <rhys@volantis.com>, noah_mendelsohn@us.ibm.com, Dan Brickley <danbri@danbri.org>, "Henry S. Thompson" <ht@inf.ed.ac.uk>, Tim Berners-Lee <timbl@w3.org>, www-tag@w3.org

Hello Pat,

With some trepidation I'll put my foot into this :-)

Reflecting on what I've written below... I think the core question that 
I have for you, Pat, is about 'bootstrapping'. How do you find out 
something about something armed only with a reference to it? In part, 
your response to Rhys about the lack of utility of the attempted direct 
reference is based on having prior knowledge of its futility. My question 
is how did 'you' come by that prior knowledge? And... in the absence of 
any knowledge, with only a referring name, what do you do if you want to 
know more about the referent?

By all means read the rest of what I've written below... but I think 
that the questions above are the kernel of it.

 
Pat Hayes wrote:
>>
>> Hello Pat,
>>
>> I'd like to ask my first dumb questions, if I may.
>
> Im sure they won't be dumb.
>
>> They concern your
>> recent response to Noah's comments.
>>
>> Noah wrote:
>> "Here's what I think may be the essence of the confusion:  there are
>> certain systems in which it is by definition possible to attempt to
>> access anything that can be referenced."
>>
>> You responded:
>> "Indeed this may well be part of the confusion. OK, there are a few such
>> systems (a VERY few), but the Web is not one of them; or at least, 
>> not if
>> its understood as described in the Web Architecture document. That
>> document is at pains to explain, quite early, that resources 
>> identified by
>> URIs can be physical things not connected in any way to the Internet, 
>> such
>> as books and people. From that point on, all talk about attempting to
>> access things that can be referenced is obviously crazy. You can't use
>> HTTP or any other xxTP to access people and books. (You can maybe access
>> some kind of description of them, which could be called a representation
>> of them, although not in the same sense of "representation" used by you
>> and the architecture document; and not by getting a TP poke to them and
>> causing them to emit a representation in response. But you didn't say
>> 'representation': you said, access anyTHING that can be referenced.)"
>>
>> My question concerns whether I've interpreted your response correctly. I
>> was somewhat surprised by your apparent assertion that the Web is not a
>> system which Noah characterised as one 'in which by definition it is
>> possible to ATTEMPT (my emphasis) to access anything that can be
>> referenced'.
>
> Before reading on, let me emphasize that I am taking Noah's words here 
> at their face value. That is, the situation as he describes it is one 
> where (1) there is a URI (2) it is known that this URI is intended to 
> reference, say, a person or a book (a "non-information resource"), and 
> (3) it makes sense to attempt to access this person or book ("the 
> thing referenced"). But I know, without further ado, that it does not 
> make sense even to attempt to access a person or book (or galaxy, 
> ..etc.).
Hmmm.... but let's say that you have no idea as to what the URI refers. 
It may refer to a person or a book or a galaxy... but for whatever 
reason you (or your agent) have no idea, a priori; all that you have is 
a (URI) name and a reference made using it (its occurrence in a 'text', 
sometimes aka webarch:Representation, or its having been typed into the 
'address bar' of a semantic web browser).

Where do we go from here with so little information (just the name)?

If, in the absence of other knowledge, we attempt to 'poke' (i.e. 
access) the thing referred to by giving the name to an HTTP engine and 
asking it to do a GET (sometimes called 'dereference' - which I'll agree 
does conflate (de-)reference and access) there is not much that HTTP 
itself can tell us about the resource.

If what is referred to is a webarch:InformationResource it can return a 
200 status code and a webarch:Representation. Webarch 'defines' 
representation as a message - a sequence of bits/octets exchanged over a 
medium; IMO, something ephemeral - and something which itself generally 
has no URI. However, webarch is less clear about what the 
webarch:Representation is a representation of.  Oh... the corresponding 
webarch:Resource... doh! Well, 'all of it'? or just 'its current state'? 
or a 'description of it?' I find that I can 'muddle' along without quite 
having to ground that one out, but it does bother me from time to time.

Descriptions of a thing can clearly be 3rd party and have many origins - 
there may be many available descriptions of you or me with varying levels 
of veracity. A webarch:Representation (a text) that happens to contain a 
description could be self-describing (could make assertions about self) 
or be descriptive of a multiplicity of things (including itself or not). 
But it is still not clear whether a text obtained by using the name of a 
thing to attempt to access the named thing, where the text contains a 
description of the named thing, should be regarded as a 
webarch:Representation 'of the thing' or 'of a thing that describes the 
thing'. Of course I expect little sympathy over this "...you tell me 
what y'all want it to mean." :-)

So... suppose what is being referred to happens to be (although we don't 
know this yet) a person or a book or a galaxy or an integer or a 
mathematical concept or whatever. Well, it is probably best not to arrange 
(through deployment of web infrastructure) for a 200 response and a 
webarch:Representation to be returned in response to an attempt to 
'access' the named thing using HTTP. 404 Resource not found is not hugely 
helpful either in giving us an account of what was being referenced. Now 
the thing being referenced is amenable to description (however imperfect). 
So... instead of providing a webarch:Representation of the referenced 
thing, a webarch:Representation (a text) conveying a description of the 
referenced thing might be useful, and might be able to give us more 
definite information about the thing originally referenced by the name 
that we have. But wait a minute.... if we simply responded with a 
webarch:Representation of such a description (which is but one of many 
possible descriptions - the sources of which we may trust to varying 
degrees - and which may be contradictory...) do we not risk confusing 
the referenced thing with the particular description (if any) that is 
returned?

So... rather than arrange (through the deployment of web infrastructure) 
for a direct 200 response with a webarch:Representation containing a 
description of the referenced thing.... better to say "no representation 
here, but you may find something useful over there" and provide another 
referring name that references a thing that (hopefully) describes 
(amongst other things) the thing that was referred to originally.

This latter is what the "303 See Other" advice from the TAG is about. A 
corollary of following this advice is that you don't deploy 
webarch:Representations for people, places, books, galaxies etc. Instead 
you arrange that attempts to 'poke' the referenced thing result in a 
visible (in the sense that the thing doing the poking is aware of it) 
redirection to something else that provides a description of the thing 
originally referenced.
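
To make the arrangement above concrete, here is a minimal sketch in Python, 
offered only as an illustration and not as anything the TAG advice itself 
prescribes. It assumes two hypothetical paths, /id/JC (the name of a person) 
and /doc/JC (the name of one description of that person), and a throwaway 
local server. The client deliberately does not follow the redirect 
automatically, so the redirection stays visible to the thing doing the poking.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer

# Hypothetical names, chosen for illustration only.
THING_PATH = "/id/JC"         # names a person (not an information resource)
DESCRIPTION_PATH = "/doc/JC"  # names a document describing that person


class Handler(BaseHTTPRequestHandler):
    def do_GET(self):
        if self.path == THING_PATH:
            # No representation of the person here; point at a description.
            self.send_response(303)
            self.send_header("Location", DESCRIPTION_PATH)
            self.send_header("Content-Length", "0")
            self.end_headers()
        elif self.path == DESCRIPTION_PATH:
            body = b"JC is a person (one description among many).\n"
            self.send_response(200)
            self.send_header("Content-Type", "text/plain; charset=utf-8")
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)
        else:
            self.send_error(404)

    def log_message(self, format, *args):
        pass  # keep the demo quiet


def poke(host, port, path):
    """GET one path without following redirects.

    Returns (status, Location header or None, body bytes).
    """
    conn = http.client.HTTPConnection(host, port, timeout=5)
    try:
        conn.request("GET", path)
        resp = conn.getresponse()
        body = resp.read()
        return resp.status, resp.getheader("Location"), body
    finally:
        conn.close()


def demo():
    """Poke the thing's name, then poke the name the 303 pointed at."""
    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: any free port
    threading.Thread(target=server.serve_forever, daemon=True).start()
    host, port = server.server_address
    try:
        first = poke(host, port, THING_PATH)
        second = poke(host, port, first[1])
        return first, second
    finally:
        server.shutdown()
        server.server_close()


if __name__ == "__main__":
    for status, location, body in demo():
        print(status, location, body)
```

The point of the two separate pokes is that the agent ends up holding two 
distinct names: the one it started with and the one the 303 handed back.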

In all of this HTTP gives us no definitive means of 'knowing' that the 
referenced thing is *not* an information resource - it could be. But it 
does provide the means to 'bootstrap' a description of the referenced 
thing at the next layer up (e.g. a narrative description for a human 
being or a machine-readable description for a mechanised agent).

[By 'you' in the above - I mean the person or entity that is 
establishing the association between a URI (which they have some 
presumed authority to associate with a thing) and a thing such that 
uses of the name as a reference are intended to refer to the associated 
thing.]

Does this make any sense to you? In the past I think you have referred 
to the TAG's "303 advice"  as lunacy or words to that effect.

For myself, though not a TAG member at the time, I find it quite 
appealing. It allows that HTTP URIs (with or without #'s) can be used to 
refer to any kind of thing, and in those cases where the thing is not a 
webarch:InformationResource it provides a means to obtain a 
(webarch:Representation of a) description of the referenced thing. 
And... in the process it provides for distinguished URI names for the 
thing and its description.

<snip/>

>> However, in general, the Web doesn't
>> provide any guaranteed way to know that in advance.
>
> Ah, but the semantic web does. If someone asserts
>
> http://ex:thingie rdf:type dc:author . 
>
>
> and I have good reason to believe it, then I have good reason to 
> believe that 'http://ex:thingie' denotes a person.
Ok... but how would you arrange to obtain such an assertion (from the 
web) armed only with the URI http://ex:thingie?
>> It's not possible to
>> know whether that which a URI identifies will respond to such a 
>> 'poke', by
>> returning some material, without actually trying it.
>
> Wait. It isn't possible to know what you will get back by poking a URI 
> without trying the poke and seeing. Yes, true. But as to whether what 
> you are poking is the >referent< of the URI, that is a different 
> question. We know (see above) there must be cases where it cannot 
> possibly be, because what the URI refers to simply isn't the kind of 
> thing that can get poked in the required way.
Ok... but in the 303 case, surely one can argue that what is poked is 
not the referent itself, but some proxy that has been placed by the 
entity that wishes to have the URI (name) associated with the referent.

I think that one could also argue that the same association could be 
established with the direct provision of the description, arguing that 
the responding resource stands proxy for the referent rather than being 
the referent. In this case one would choose not to care about the 
information/non-information nature of a resource (a choice that many 
would make - but some find the distinction important).

I find some irony in that the introduction of the notion of an 
Information Resource was at least in part motivated as a response (with 
which I believe you were happy - at least at some point in time) 
to some of your comments on Webarch. Indeed I think that you may even have 
said earlier in this thread that the distinction is a useful one to make.

Maybe what you are 'objecting' to is the provision of any mechanism to 
signal such a distinction in HTTP... and the mechanism advised by the 
TAG is only partial in any case, given that (if you buy into it) it can 
only signal for certain that a referent is an information resource.
> So yes, the only way to tell what the accessible thing (if there is 
> one) does is to poke it and see; but that is irrelevant to questions 
> of reference. Neither the thing poked nor what it sends back in return 
> need be what the URI >denotes< or >refers to<. This is one good reason 
> why we need to distinguish between what you get when you poke with it, 
> and what it refers to. They need not be the same. I think that if the 
> SWeb ever takes off (or begins to slide downhill, using Tim's bobsled 
> metaphor) then *most* URIs will be like this.
>
>> Now, I agree with you that it could well be described as crazy to make
>> subsequent attempts to access something via the Web that you already 
>> know
>> is 'physical'. But using the core facilities provided by the Web, 
>> it's not
>> possible to know that for sure without attempting that first access.
>
> Im not sure if RDF and Web ontologies count as 'core', but I am 
> assuming them to be part of the Web.

Ok.... but somehow you still have to have obtained that first assertion 
that tells you that the referent of http://example.org/JC is in fact a 
foaf:Person, and have chosen to believe it. Armed with some persistence 
of that assertion, yes, one need never go there again. And yes, 
I agree that you may have happened upon this assertion without ever 
having to have 'poked' http://example.org/JC, and so you need never 
go there to find out some fact about the referent. But what of the 
case, at the beginning, where all you have is a name? What are your 
strategies for finding out something? There may be many... but one 
'obvious' one is an attempted direct access; another is to ask 
someone that you trust.
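
The strategies just listed can be sketched roughly as follows. This is only 
an illustration under stated assumptions: assertions are modelled as plain 
(subject, predicate, object) string tuples, the function names are 
hypothetical, and the order in which the strategies are tried is an arbitrary 
choice made for the sketch, not something the discussion above settles. The 
trusted source and the direct access are passed in as callables so that no 
particular protocol is implied.

```python
from typing import Callable, Iterable, Optional, Tuple

Triple = Tuple[str, str, str]
RDF_TYPE = "rdf:type"


def known_type(uri: str, assertions: Iterable[Triple]) -> Optional[str]:
    """Return the first asserted rdf:type for uri among assertions we hold."""
    for subject, predicate, obj in assertions:
        if subject == uri and predicate == RDF_TYPE:
            return obj
    return None


def find_out(
    uri: str,
    held: Iterable[Triple],
    ask_trusted: Callable[[str], Optional[str]],
    attempt_access: Callable[[str], Optional[str]],
) -> Tuple[str, Optional[str]]:
    """Try each strategy in turn; report which one answered and what it said."""
    # 1. Prior knowledge: an assertion we already hold and chose to believe.
    answer = known_type(uri, held)
    if answer is not None:
        return ("prior assertion", answer)
    # 2. Ask someone we trust.
    answer = ask_trusted(uri)
    if answer is not None:
        return ("trusted source", answer)
    # 3. With nothing but the name, attempt a direct access (a 'poke').
    return ("direct access", attempt_access(uri))


if __name__ == "__main__":
    held = [("http://example.org/JC", RDF_TYPE, "foaf:Person")]
    print(find_out("http://example.org/JC", held,
                   lambda u: None, lambda u: None))
```

With an empty set of held assertions and nobody to ask, the sketch falls 
through to the direct access, which is the bootstrapping case in question.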

>> Some systems may, of course, provide mechanisms for making assertions
>> about URIs that can help avoid the need to attempt an access in order to
>> find out about what is being referenced.
>
> Without some such making of assertions, it is impossible to either 
> determine or record what kind of entity a URI is being used to refer to.
>
>> But since URIs are universal,
>> other more general systems, encountering such URIs, may attempt access
>> because they are unaware of that additional information and hence don't
>> know any better.
>
> Isnt that what 404 errors are for?
>
>> Only by attempting an access can they find out anything
>> more about the resource.
>
> Again, I think this is a mistake. Failure of access tells you that 
> some information isn't in the place you thought it might be, given its 
> name. That isn't the same kind of question as asking what the name 
> denotes. If the denoted resource is a person or a book, you cannot 
> possibly find out anything by attempting to access it, other than it 
> cannot be accessed. (And of course there can be any number of other 
> reasons why it cannot be accessed.) Notice Im distinguishing here 
> between poking blindly to see what happens, and 'attempting to access' 
> a particular thing.
>
>> You are probably aware that the behaviour
>> associated with attempts to access URIs that identify physical things
>> occupies a large part of the httpRange-14 finding on which the TAG is
>> currently working.
>
> Yes, and I think that this finding is profoundly flawed. (BTW, 
> non-information resources do not have to be physical: for example, 
> fictional characters cannot be poked, and neither can integers or 
> relations or classes.)
>
>> Ok, so now I'd better try and phrase the questions. First, does it 
>> sound,
>> from what I've written here, as though I understood your response
>> correctly, or have I simply missed the point?
>
> I think you have missed my point, yes. You seem to be working on the 
> assumption that the only way to discover whether a URI >refers to< 
> something is to use it to access (and if it succeeds, then it refers 
> to the [source of the] retrieved data, more or less), i.e. that 
> successful access determines reference, which is exactly what I am 
> arguing should NOT be assumed.

I think we would only argue that attempted access is 'A' way of 
'POSSIBLE' discovery of "whether a URI >refers to< something". FWIW, 
I'm also of the opinion that what you find out by poking may conflict 
with many other potential sources of 'discovery' and how you then sort 
out what you then believe... is something we have been (and probably 
will remain) silent about.
>
>> Second, if I have understood
>> your response, does this throw any light on why it's important that
>> attempts to access physical things identified by URIs need to be 
>> supported
>> by the Web in order for there to be a general mechanism by which it is
>> possible to discover that the thing is indeed physical?
>
> Im afraid not. Why does the current (non-semantic) Web care what a URI 
> refers to, and whether or not that thing is physical?  
I don't think that it does... and I think that, to many who have 
observed this debate in all its forms over the past few years, we are 
indeed, well, angels dancing on the head of a pin.

What IS important to some is that the referent of a URI is invariant 
between the 'classic' Web and the Semantic Web. That they are not two 
distinct systems, but one, and thus the referring names of the Semantic 
Web are not entirely free names that could be bound to anything - but 
are constrained by the bindings imposed by the classic web.

> Suppose I were to suggest that the Web suffers from a vitality crisis, 
> in that some URIs refer to living things and others do not, and it is 
> vital for HTTP engines to distinguish these, so any HTTP GET should 
> emit a special error code (909) to signal that its referent is alive. 
> This is almost exactly similar to the httpRange-14 finding, and almost 
> exactly as silly.
Er...well, no actually... as you have remarked yourself, the 303 advice 
does *not* in fact say that the referent is alive... the 200/303 advice 
(of itself, if followed) only ever tells you that something is (at the 
moment!) an Information Resource (should that be of interest to you).
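
Read that way, what a client may conclude from a single poke is quite 
limited. A minimal sketch in Python, assuming the advice is followed by 
whoever deployed the URI (the function name and the returned strings are 
hypothetical, chosen only for illustration):

```python
from typing import Optional


def interpret(status: int, location: Optional[str] = None) -> str:
    """What one poke lets us conclude about the referent, and no more."""
    if status == 200:
        # The only definite signal: the referent is an information resource.
        return "information resource"
    if status == 303 and location:
        # Says nothing about the kind of referent; offers another name.
        return "undetermined; a description may be found at " + location
    # 404 and everything else: no account of what was being referenced.
    return "undetermined"


if __name__ == "__main__":
    print(interpret(200))
    print(interpret(303, "/doc/JC"))
    print(interpret(404))
```

Note that no branch ever returns "not an information resource"; the 303 
case leaves the question open.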
> The Web without the SWeb simply does not concern itself with semantic 
> questions of reference >>at all<<. They do not arise.
:-) Well, yes and no... the odd person will come to the point of 
wondering whether or not statements are being made about the actual 
weather in Oaxaca or merely a weather report about the same. Yes, the 
technology of the web doesn't care - it only becomes an issue when people 
start to say things about what URIs refer to (from their pov). People 
have been making such assertions informally since before the Semantic 
Web - and yes, the Web itself didn't care whether they were wrong or right.
> It is concerned only with moving chunks of information from place to 
> place (and archiving them, and so forth: I do not mean to suggest this 
> is all trivial or not worth serious effort to properly design.) 
> Reference is a semantic notion, concerned with what the names in the 
> texts refer to. These have nothing to do with one another. That 
> http-Range-14 is even being discussed by the TAG at all is a symptom 
> of a confusion between two distinct notions, both of which are being 
> referred to by the word 'identifies'
Take the people away and I guess we agree. However, in a hypertext 
document there is presumably some intention in the placement of a 
hypertext reference. The person making the reference in authoring the 
document has some sense of what the referent is - and to a greater or 
lesser extent is trying to convey that sense to anyone that happens to 
read their work. Yes... I agree that the 'system' of the classic web is 
ignorant of all that - but its users (both authors and consumers) are not.
>
> Do you see my point?

I think so....
>
>> Very best wishes
>
> and to you
>
> Pat
>
>> Rhys Lewis
>
> PS. That reads like a Welsh name. I have happy memories of an early 
> childhood in Maesteg, in the Llynfi valley.
Best Regards

Stuart Williams
-- 

Hewlett-Packard Limited registered Office: Cain Road, Bracknell, Berks RG12 1HN
Registered No: 690597 England
Received on Monday, 9 July 2007 11:40:47 GMT
