Re: Resources and URIs from pat hayes on 2003-04-25 (uri@w3.org from April 2003)

From: pat hayes <phayes@ai.uwf.edu>
Date: Fri, 25 Apr 2003 15:32:51 -0500
To: Graham Klyne <gk@ninebynine.org>
Cc: uri@w3.org
Message-Id: <p05210602bacf2484e499@[10.0.100.12]>
>[Offlist... deviating]
>
>At 13:19 24/04/2003 -0500, you wrote:
>>>At 16:51 23/04/2003 -0700, Roy T. Fielding wrote:
>>>>>"This document specifies the syntax of URIs, which are a form of 
>>>>>global identifier used in Web protocols and languages. 
>>>>>Particular uses of URIs, and their intended meanings in various 
>>>>>contexts, are described in other specifications. In general, the 
>>>>>entities referred to or identified by URIs when used in Web 
>>>>>contexts are called "resources", but this document does not 
>>>>>specify the nature of resources or restrict resources to any 
>>>>>particular category of entities."
>>>>>
>>>>>and leave it at that.  Nothing else at all about resources, no 
>>>>>examples, no discussion.
>>>>
>>>>No.  Look, you guys aren't the ones who have to answer questions in the
>>>>absence of definitions.  I do.    I refuse to leave what has been deployed
>>>>in an unspecified state, regardless of how many arguments that causes
>>>>in the Semantic Web.
>>>
>>>OK.  I assume this is in response to the phrase "this document 
>>>does not specify ...".
>>>
>>>I think the argument about 'refer' vs 'identify' is a bit sterile, 
>>>because I think I can supply an identifier for anything that Pat 
>>>can refer to, and I think that anything with an identifier can be 
>>>said to have identity (the identifier being sufficient if not 
>>>necessary); and clearly anything identified can be referenced.
>>
>>Well, I'm worried about things like descriptions. Here's a topical 
>>example. In designing DQL we had to consider the case where a 
>>server is able to prove that something exists which satisfies 
>>the query but it has no URI to hand back as a binder to the 
>>must-bind variable in the query pattern.  Now, DQL can make up a 
>>URIreference of its own to be the 'name' of the thing it knows must 
>>exist. Fine so far, but is it really correct to say that this URI 
>>*identifies* that thing? I honestly do not know, but I am worried 
>>that many folk (eg Michael Mealling <michael@neonym.net>, at a 
>>guess) will read that as meaning something much stronger, for 
>>example as meaning that the URIref has been 'bound' to the thing in 
>>some sense (I'm not sure what this means, but it sounds a lot 
>>tighter than merely being used to refer to, in a query 
>>transaction).  With that understanding, what the DQL server does in 
>>this case could be an incorrect or inappropriate use of URIrefs, or 
>>'gibberish', in Michael's phrase.
>
>I don't perceive this as a "what is a resource" issue so much as a 
>(possible) violation of the expectation that a given URI identifies 
>some particular resource.

But look, I'm trying to find out what that MEANS.  All these words 
like "identify", "bound to" are terms of art which aren't being 
defined.  What am I committing to if I agree that a URI "identifies" 
something? I'm pretty sure that Michael and other folk think I'm 
committing to a whole lot more than I want to commit to.  For 
example, if it means that for each URI there has to be one single 
thing which that URI must denote, then I certainly don't want to 
commit to that (see later for why: basically, this is a claim that 
URIs have magical powers).

In the case in question, the server knows that there is a thing that 
satisfies the query; but that is ALL it knows.  It can't 'identify' a 
particular thing, it can't access it, it has no representation of it 
(other than the assertion in the query pattern itself), there is no 
web-accessible thing that can be said to be 'about' it:  none of that 
at all, nothing that fits into the REST model. It just knows that 
something exists. Now, is that enough to say that it has 'identified' 
that thing (about which it knows diddly-squat other than that it 
exists) ?? Or not?? I just want a clear answer one way or the other, 
is all.  I want to know what this word "identifies" is supposed to 
mean.

Note, it would be wrong to say that there is a single unique thing 
and the server can 'identify' that thing in the sense of singling it 
out from all other things. It can't. That is a *much* stronger claim 
than its knowing that there is only one such thing, eg consider the 
difference between 'Someone in this room murdered the Count' and 
(pointing at someone) 'That person murdered the Count'.

>  Isn't this why we introduced bnodes distinct from URIs?

For use in reasoning, yes, but there is no way to pass a bnode 
outside its scope as an answer binding (and it would be meaningless 
to do so); and the only kinds of bindings we have are URIrefs.

>If there were exactly one thing that satisfied the query, then I 
>don't see the generated URI being problematic, but having it stand 
>for any of some collection of possible (query result) values then 
>it's more questionable.

It doesn't stand for the collection, it stands for a thing known to 
exist.  Of course if there were more than one of them, then it is 
ambiguous; but since it is ambiguous anyway, that hardly seems to 
matter.

>  I suppose you might finesse the meaning by saying it's bound to some 
>particular value without providing any further means to figure which 
>one

Exactly; that is what the model theory says about all URIrefs, 
though, and has always said about them, since they are always 
interpreted relative to an interpretation.

You tell me, how exactly does one 'figure out' what the URI 
http://www.w3.org/2001/XMLSchema is "bound" to? Even a human being? 
Try looking up "namespace" in http://www.m-w.com/netdict.htm and see 
what you get, for a start.

>, but this is starting to look to me like a problem looking for 
>somewhere to happen.  Even if the generated name is truly unique, 
>never to be returned again, other folks might still start trying to 
>make assertions about it ("I think it's the first value that matches 
>the query" and "I think it's the smallest value that matches the 
>query", etc.), so it starts to acquire a specificity that really was 
>not intended for the query result.

Yes, but that is true for all URIrefs, everywhere.  Anyone else can 
assert that my webpage denotes the Queen.  But these URIrefs 'belong' 
to the server, so any assertions made about them from another source 
are not warranted by the server.  I don't see any problems arising 
here that aren't endemic to the entire Web (and so are unlikely to be 
real problems, in fact, since the Web seems to work quite well.)

But if this bothers you, assume that the server knows that something 
exists and that it is unique, i.e. that there is only one thing 
satisfying the query (eg it might know that the set of Joe's siblings 
has cardinality 1).  Still, if that is ALL it knows, is it OK to say 
that this thing is "identified", and that if the server makes up a 
URI for it, that the URI is "bound" to it? How does one answer a 
question like that?

>Another example:  when CWM

Well, CWM violates the RDF spec.

>uses unbound graph nodes in queries, not quite the same as what you 
>describe, it requires the names to be qualified with a log:forall 
>qualifier, which is all beyond standard RDF as I understand it.

Indeed, way beyond. And that seems to be about bnodes in the query 
(which act like universals rather than existentials, in a query), not 
bindings to query variables.

>  There's also a log:forSome which may be used for the case of 
>unbound query results, maybe similar to your DQL case.

Well, certainly things would be easier for DQL if it were possible to 
return an explicit existential, but the basic issue about URIs would 
still remain. If you like, think of it as the question, is it kosher 
to Skolemize on the Web? That is, can I make up a new URI and say 
that it denotes something, just on the basis of knowing that 
something *exists*, and not knowing anything else about it? If so, 
how do I "bind" this thing - about which I know virtually nothing - 
to my URI, or make my URI "identify" it?  Particularly if I can't 
identify it and I have nothing to bind it with.
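
To make the Skolemization question concrete, here is a minimal Python 
sketch of what a server might do.  It is an illustration only, not 
DQL's actual behaviour: the GENID_BASE namespace, the skolemize 
function, and the use of None for a must-bind variable with no known 
binder are all assumptions made up for this example.

```python
import uuid

# Hypothetical namespace owned by the answering server (an assumption for
# this sketch; it is not defined by DQL or by RFC 2396).
GENID_BASE = "http://example.org/genid/"


def skolemize(bindings):
    """Return a copy of `bindings` with every unbound variable given a fresh URIref.

    A value of None stands for "something satisfying the pattern is known to
    exist, but the server has no URI for it". The minted URIref records only
    that existence claim and nothing else about the thing.
    """
    answer = {}
    for variable, value in bindings.items():
        if value is None:
            # Mint a new, never-before-used name for the thing known to exist.
            answer[variable] = GENID_BASE + uuid.uuid4().hex
        else:
            answer[variable] = value
    return answer


# Joe is known to have a sibling, but no URI for the sibling is available.
answer = skolemize({
    "?person": "http://example.org/people#Joe",
    "?sibling": None,
})
```

Nothing in the sketch settles whether the minted URIref "identifies" 
the sibling; it only shows that minting one requires no knowledge 
beyond existence.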

>[[
><SyntaxArc rdf:about="http://www.w3.org/2000/10/swap/log#forAll">
><rdfs:comment>A is true for any object in place of B.</rdfs:comment>
><rdfs:label>for All</rdfs:label>
></SyntaxArc>
>
><rdf:SyntaxArc rdf:about="http://www.w3.org/2000/10/swap/log#forSome">
><rdfs:comment>
>A is true for some object for which here we use B. This
>         is NOT a real rdf property in its behaviour - a pseudoproperty.
>         For example, removal of it from a formula does not
>         preserve truth, and substitution is not permitted on its object.
></rdfs:comment>
><rdfs:domain rdf:resource="http://www.w3.org/2000/10/swap/log#Formula"/>
></rdf:SyntaxArc>
>]]
>-- http://www.w3.org/2000/10/swap/log
>
>>If RFC 2396 rules that this is inappropriate then we will probably 
>>redesign DQL.  Which we could do, with some pain, but I would like 
>>to know clearly one way or the other.  If I read "anything that can 
>>be identified ..." somewhere, with no further exposition, I still 
>>don't know whether this is OK or not, because I don't know what 
>>"identified" is supposed to mean. In DQL usage, it's certainly not 
>>what is meant by using an 'identifier' in a programming language, 
>>it doesn't cause the thing that exists to have an identity (if it 
>>didn't have one already), and it doesn't imply any kind of 
>>binding-to going on anywhere.
>
>So I think there are two questions:
>
>(1) what is a resource?
>(2) does a URI identify a single particular resource?
>
>I think the answer to (2) is "yes" by my understanding of URIs (e.g. 
>RFC2396 section 1.1:  "An identifier is an object that can act as a 
>reference to *something* [that has identity].").  Even if you ignore 
>the problematic words [that has identity] (I think they're redundant 
>here), I think the words still say that the identifier refers to a 
>single entity:  "something" is singular.

I still want to know what it means for something to be "identified". 
It sounds like you are saying that it means that there is a single 
thing - an actual thing, not a representation of a thing - which the 
URI has to denote.  That is admittedly clear, but it has the 
disadvantage of being an impossible fantasy.  If true, it would mean 
that URIs had magical properties.

This 'unique referent' claim, if taken seriously, is an incredibly 
strong claim. It seems to be predicated on an assumption about the 
Web which is false of all other known representational and linguistic 
schemes, that names are 'true names' which *inherently*, in their 
very nature, identify a single thing in all possible interpretations; 
URIs, according to this, are names with LOGICALLY NECESSARY 
denotations.  No other names are like that, in any human or 
artificial language or naming scheme ever devised, except maybe 
numerals (and even then not if you want to stay computable).  Even 
if you were to physically attach the names to their intended 
referents, like name badges worn by people at a symposium, there 
would still be some ambiguity: does it denote the person at that 
moment, the person considered as a citizen, the person's body, the 
person's clothing, the role they are playing in the gaming 
convention....?? I know there is an answer in the case of name 
badges, but the point is that this answer depends on an external 
convention known to the users, an implicit shared set of assumptions, 
a background. It's not inherent in the very idea of a name badge. You 
CAN interpret name badges differently, and I expect some symposia do. 
And as soon as you allow this kind of contextuality, you lose 
uniqueness of denotation. But according to what you say here, it is 
*logically impossible* to mis-interpret a URI.

Here's how this manifests itself in a web formalism, like RDF. 
Suppose I make some RDF (or whatever) assertions about a thing using 
its URI. If there is a single referent, it must be the same referent 
*in all possible interpretations of my assertions*. So what I write 
just IS true or false of that thing, and a reasoning engine ought to 
be able to find out which *just by looking at the URI*.  But of 
course it can't possibly find it out in that way, in general, even if 
we allow it to use Web machinery on the URI; and the reason it can't 
is because this assumption is completely false: what a URI denotes 
*depends on the interpretation*, just like names and referring 
expressions in all the other languages and notational schemes ever 
invented.
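
A toy version of this, as a minimal Python sketch: the URI, the 
property and object names, and the two domains below are all invented 
for illustration, and the satisfies function is a drastic 
simplification of the model theory, not the RDF semantics itself.

```python
# One assertion made with a URI reference (all names are illustrative only).
URI = "http://example.org/thing"
ASSERTIONS = [(URI, "ex:hasAuthor", "ex:Pat")]


def satisfies(denotes, extension, triples):
    """True if every triple holds under the given interpretation.

    `denotes` maps each name to an element of some domain; `extension` maps
    the denotation of each property to the set of pairs it relates.
    """
    return all(
        (denotes[s], denotes[o]) in extension[denotes[p]]
        for s, p, o in triples
    )


# Interpretation A: the URI denotes domain element 1.
denotes_a = {URI: 1, "ex:hasAuthor": "author", "ex:Pat": 2}
extension_a = {"author": {(1, 2)}}

# Interpretation B: the same URI denotes a different element, 7.
denotes_b = {URI: 7, "ex:hasAuthor": "author", "ex:Pat": 9}
extension_b = {"author": {(7, 9)}}

# Both interpretations make the assertion true...
both_satisfied = (
    satisfies(denotes_a, extension_a, ASSERTIONS)
    and satisfies(denotes_b, extension_b, ASSERTIONS)
)
# ...yet they disagree about what the URI denotes.
same_referent = denotes_a[URI] == denotes_b[URI]
```

Here both_satisfied is True and same_referent is False: the assertion 
alone does not pick out one referent for the URI.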

This point is made even stronger if the things that one GETs by using 
a URI are considered to be representations of resources, since then 
the meaning of the representation depends on which semantic 
conventions are applied to it.

>As for the answer to (1), I agree with you (modulo debatable 
>edge-cases I don't understand) that a resource is, practically, 
>anything.
>
>Did you have any problem with my suggested revision to your words, 
>that were an attempt to overcome Roy's objection?

You mean
[[
The entities referred to or identified by URIs when used in Web 
contexts are called "resources".  Anything can be a resource:  there 
is no restriction on the nature of resources.
]]

Fine with me.

Pat
-- 
---------------------------------------------------------------------
IHMC					(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola              			(850)202 4440   fax
FL 32501           				(850)291 0667    cell
phayes@ai.uwf.edu	          http://www.coginst.uwf.edu/~phayes
s.pam@ai.uwf.edu   for spam
Received on Friday, 25 April 2003 16:32:54 UTC