Re: Resources and URIs

This message does not contribute directly to the RFC2396bis wording debate.

I'm trying to explore, in response to Pat's comments, reconciliation of 
formal notions of denotation with the commonly understood idea of URIs as 
identifiers.

At 15:32 25/04/2003 -0500, pat hayes wrote:

>>[Offlist... deviating]
>>
>>At 13:19 24/04/2003 -0500, you wrote:
>>>>At 16:51 23/04/2003 -0700, Roy T. Fielding wrote:
>>>>>>"This document specifies the syntax of URIs, which are a form of 
>>>>>>global identifier used in Web protocols and languages. Particular 
>>>>>>uses of URIs, and their intended meanings in various contexts, are 
>>>>>>described in other specifications. In general, the entities referred 
>>>>>>to or identified by URIs when used in Web contexts are called 
>>>>>>"resources"., but this document does not specify the nature of 
>>>>>>resources or to restrict resources to any particular category of entities."
>>>>>>
>>>>>>and leave it at that.  Nothing else at all about resources, no 
>>>>>>examples, no discussion.
>>>>>
>>>>>No.  Look, you guys aren't the ones who have to answer questions in the
>>>>>absence of definitions.  I do.    I refuse to leave what has been deployed
>>>>>in an unspecified state, regardless of how many arguments that causes
>>>>>in the Semantic Web.
>>>>
>>>>OK.  I assume this is in response to the phrase "this document does not 
>>>>specify ...".
>>>>
>>>>I think the argument about 'refer' vs 'identify' is a bit sterile, 
>>>>because I think I can supply an identifier for anything that Pat can 
>>>>refer to, and I think that anything with an identifier can be said to 
>>>>have identity (the identifier being sufficient if not necessary); and 
>>>>clearly anything identified can be referenced.
>>>
>>>Well, I'm worried about things like descriptions. Here's a topical 
>>>example. In designing DQL we had to consider the case where a server is 
>>>able to prove that that something exists which satisfies the query but 
>>>it has no URI to hand back as a binder to the must-bind variable in the 
>>>query pattern.  Now, DQL can make up a URIreference of its own to be the 
>>>'name' of the thing it knows must exist. Fine so far, but is it really 
>>>correct to say that this URI *identifies* that thing? I honestly do not 
>>>know, but I am worried that many folk (eg Michael Mealling 
>>><michael@neonym.net>, at a guess) will read that as meaning something 
>>>much stronger, for example as meaning that the URIref has been 'bound' 
>>>to the thing in some sense (I'm not sure what this means, but it sounds 
>>>a lot tighter than merely being used to refer to, in a query 
>>>transaction).  With that understanding, what the DQL server does in this 
>>>case could be an incorrect or inappropriate use of URIrefs, or 
>>>'gibberish', in Michael's phrase.
>>
>>I don't perceive this as a "what is a resource" issue so much as a 
>>(possible) violation of the expectation that a given URI identifies some 
>>particular resource.
>
>But look, I'm trying to find out what that MEANS.  All these words like 
>"identify", "bound to" are terms of art which aren't being defined.  What 
>am I committing to if I agree that a URI "identifies" something? I'm 
>pretty sure that Michael and other folk think I'm committing to a whole 
>lot more than I want to commit to.  For example, if it means that for each 
>URI there has to be one single thing which that URI must denote, then I 
>certainly don't want to commit to that (see later for why: basically, this 
>is a claim that URIs have magical powers.).

Oops, in that message, to you, I should have said "denotes".  (I was trying 
to separate the nature-of-resource question from what-is-identity question, 
and failed.)

As for the uniqueness of denotation... I think I begin to see the problem 
here.  .....

>In the case in question, the server knows that there is a thing that 
>satisfies the query; but that is ALL it knows.  It can't 'identify' a 
>particular thing, it can't access it, it has no representation of it 
>(other than the assertion in the query pattern itself), there is no 
>web-accessible thing that can be said to be 'about' it:  none of that at 
>all, nothing that fits into the REST model. It just knows that something 
>exists. Now, is that enough to say that it has 'identified' that thing 
>(about which it knows diddly-squat other than that it exists) ?? Or not?? 
>I just want a clear answer one way or the other, is all.  I want to know 
>what this word "identifies" is supposed to mean.
>
>Note, it would be wrong to say that there is a single unique thing and the 
>server can 'identify' that thing in the sense of single it out from all 
>other things. It can't. That is a *much* stronger claim than it's knowing 
>that there is only one such thing, eg consider the difference between 
>'Someone in this room murdered the Count' and (pointing at someone) 'That 
>person murdered the Count'.
>
>>  Isn't this why we introduced bnodes distinct from URIs?
>
>For use in reasoning, yes, but there is no way to pass a bnode outside its 
>scope as an answer binding (and it would be meaningless to do so); and the 
>only kinds of bindings we have are URIrefs.

In logic, I think one uses expressions like (exists x . f(x)) to express 
such.    Is a query answer is an RDF graph with bnodes, I interpret that as 
meaning much the same thing -- I'm not asking to export the binding, just 
express the query result.  My concern is that URIs are not the right form 
to denote some elements of query results, because some parts of query 
results are not "resources" in the intended sense.

>>If there were exactly one thing that satisfied the query, then I don't 
>>see the generated URI being problematic, but having it stand for any of 
>>some collection of possible (query result) values then it's more questionable.
>
>It doesn't stand for the collection, it stands for a thing known to 
>exist.  Of course if there were more than one of them, then it is 
>ambiguous; but since it is ambiguous anyway, that hardly seems to matter.

I don't see that contradicting what I'm trying to say.

>>  I suppose you might finesse the meaning by saying its bound to some 
>> particular value without providing any further means to figure which one
>
>Exactly; that is what the model theory says about all URIrefs, though, and 
>has always said about them, since they are always interpreted relative to 
>an interpretation.
>
>You tell me, how exactly does one 'figure out' what the URI 
>http://www.w3.org/2001/XMLSchema is "bound" to? Even a human being? Try 
>looking up "namespace" in http://www.m-w.com/netdict.htm and see what you 
>get, for a start.

One of the aspects of model theory that it's difficult for ordinary 
programmers to fully take on board is the idea of any denotation being with 
respect to a given interpretation, and the fact that there is (for MT) no 
single overriding interpretation that is the root of all truths.

I think I can see why MT needs to take this approach in the justification 
of proof techniques.  If something works for all possible interpretations, 
it must also work for any particular interpretation.

But the jobbing programmer in me doesn't need all this stuff:  for most 
practical work, especially at the level of Internet protocols, a URI is 
paired with ("bound to") a resource in Humpty-Dumpty fashion:  it means 
what I say it means (or what the specs say it means, or what it has been 
determined through interoperability testing that most developers think the 
specs say it means).  Implicitly, there is a single interpretation that 
defines the denotation of the URI.  For most practical purposes, I really 
don't want to be concerned with interpretations other than the one I "know" 
I'm concerned with.  On this basis, people have constructed a functioning 
Internet.

So how can we deal in terms of this "fiction" of a single interpretation in 
which URIs Identify (denote a single thing/concept) -- which I believe we 
must do if the specification is to have any traction with real developers 
-- without committing you to a belief in magic?  (In part, I agree with you 
to say as little as possible, but maybe a tiny bit more than you said -- 
per my proposed wording copied at the end of this message.)

>>, but this is starting to look to me like a problem looking for somewhere 
>>to happen.  Even if the generated name is truly unique, never to be 
>>returned again, other folks might still start trying to make assertions 
>>about it ("I think it's the first value that matches the query" and "I 
>>think it's the smallest value that matches the query", etc., so it starts 
>>to acquire an specificity that really was not intended for the query result.
>
>Yes, but that is true for all URIrefs, everywhere.  Anyone else can assert 
>that my webpage denotes the Queen.  But these URIrefs 'belong' to the 
>server, so any assertions made about them from another source are not 
>warranted by the server.  I don't see any problems arising here that 
>aren't endemic to the entire Web (and so are unlikely to be real problems, 
>in fact, since the Web seems to work quite well.)

Yes, I agree.  But somehow there is this expectation of a particular 
interpretation, which in many cases is widely-enough held to work for many 
practical applications...

>But if this bothers you, assume that the server knows that something 
>exists and that it is unique, i.e. that there is only one thing satisfying 
>the query (eg it might know that the set of Joe's siblings has cardinality 
>1).  Still, if that is ALL it knows, is it OK to say that this thing is 
>"identified", and that if the server makes up a URI for it, that the URI 
>is "bound" to it? How does one answer a question like that?

I think there is an expectation that URIs are usable outside of any context 
in which they may be generated.  (Though there is not, I think, any general 
expectation that any string that happens to follow URI syntax is usable as 
a URI.)

I'm not sure that this expectation holds for a query result.

The problem here, I guess, is that there's no *formal* way to distinguish 
these cases -- neither a URI or a query result are formally defined to 
denote a particular value, but there's an expectation in one case that 
affects the way that people want to use them differently.

>>Another example:  when CWM
>
>Well, CWM violates the RDF spec.
>
>>ses unbound graph nodes in queries, not quite the same as what you 
>>describe, it requires the names to be qualified with a log:forall 
>>qualifier, which is all beyond standard RDF as I understand it.
>
>Indeed, way beyond. And that seems to be about bnodes in the query (which 
>act like universals rather than existentials, in a query), not bindings to 
>query variables.
>
>>  There's also a log:forSome which may be used for the case of unbound 
>> query results, maybe similar to your DQL case.
>
>Well, certainly things would be easier for DQL if it were possible to 
>return an explicit existential, but the basic issue about URIs would still 
>remain. If you like, think of it as the question, is it kosher to 
>Skolemize on the Web? That is, can I make up a new URI and say that it 
>denotes something, just on the basis of knowing that something *exists*, 
>and not knowing anything else about it? If so, how do I "bind" this thing 
>- about which I know virtually nothing - to my URI, or make my URI 
>"identify" it?  Particularly if I can't identify it and I have nothing to 
>bind it with.

I admit I can't formally distinguish the cases.  But I'm not so comfortable 
with using URIs as Skolem constants.  I don't think we can yet declare a 
consensus about whether or not its kosher.

>>[[
>><SyntaxArc rdf:about="http://www.w3.org/2000/10/swap/log#forAll">
>><rdfs:comment>A is true for any object in place of B.</rdfs:comment>
>><rdfs:label>for All</rdfs:label>
>></SyntaxArc>
>>
>><rdf:SyntaxArc rdf:about="http://www.w3.org/2000/10/swap/log#forSome">
>><rdfs:comment>
>>A is true for some object for which here we use B. This
>>         is NOT a real rdf property in its behaviour - a pseudoproperty.
>>         For example, removal of it from a formula does nor
>>         preserve truth, and substitution is not permitted on its object.
>></rdfs:comment>
>><rdfs:domain rdf:resource="http://www.w3.org/2000/10/swap/log#Formula"/>
>></rdf:SyntaxArc>
>>]]
>>-- http://www.w3.org/2000/10/swap/log
>>
>>>If RFC 2396 rules that this is inappropriate then we will probably 
>>>redesign DQL.  Which we could do, with some pain, but I would like to 
>>>know clearly one way or the other.  If I read "anything that can be 
>>>identified ..." somewhere, with no further exposition, I still don't 
>>>know whether this is OK or not, because I don't know what "identified" 
>>>is supposed to mean. In DQL usage, it's certainly not what is meant by 
>>>using an 'identifier' in a programming language, it doesn't cause the 
>>>thing that exists to have an identity (if it didn't have one already), 
>>>and it doesn't imply any kind of binding-to going on anywhere.
>>
>>So I think there are two questions:
>>
>>(1) what is a resource?
>>(2) does a URI identifiy a single particular resource?
>>
>>I think the answer to (2) is "yes" by my understanding of URIs (e.g. 
>>RFC2396 section 1.1:  "An identifier is an object that can act as a 
>>reference to *something* [that has identity]."  Even if you ignore the 
>>problematic words [that has identity] (I think they're redundant here), I 
>>think the words still say that the identifier refers to a single 
>>entity:  "something" is singular.
>
>I still want to know what it means for something to be "identified". It 
>sounds like you are saying that it means that there is a single thing - an 
>actual thing, not a representation of a thing - which the URI has to 
>denote.  That is admittedly clear, but it has the disadvantage of being an 
>impossible fantasy.  If true, it would mean that URIs had magical properties.

Yes, formally, it's a fantasy -- not sustainable.  But it's a fantasy 
widely-enough believed to be of practical value.

>This 'unique referent' claim, if taken seriously, is an incredibly strong 
>claim. It seems to be predicated on an assumption about the Web which is 
>false of all other known representational and linguistic schemes, that 
>names are 'true names' which *inherently*, in their very nature, identify 
>a single thing in all possible interpretations; URIs, according to this, 
>are names with LOGICALLY NECESSARY denotations.  No other names are like 
>that, in any human or artificial language or naming scheme ever devised, 
>except maybe numerals (and even then not if you want to stay 
>computable.).  Even if you were to physically attach the names to their 
>intended referents, like name badges worn by people at a symposium, there 
>would still be some  ambiguity: does it denote the person at that moment, 
>the person considered as a citizen, the person's body, the person's 
>clothing, the role they are playing in the gaming convention....?? I know 
>there is an answer in the case of name badges, but the point is that this 
>answer depends on an external convention known to the users, an implicit 
>shared set of assumptions, a background. It's not inherent in the very 
>idea of a name badge. You CAN interpret name badges differently, and I 
>expect some symposia do. And as soon as you allow this kind of 
>contextuality, you lose uniqueness of denotation. But according to what 
>you say here, it is *logically impossible* to mis-interpret a URI.

I don't know if this is just word-play, but my view is not to attempt to 
claim a URI has a single denotation in all possible interpretations, but to 
suggest that there exists a particular intended interpretation, which 
provides a denotation for all URIs, and hence that the thing "identified" 
by a URI is that which it denotes in this particular interpretation.

It seems clear that we don't know enough to completely specify this 
interpretation, but on the other hand there are demonstrably a useful 
number of things we do agree on for it to be of some practical use.

#g
--

>Here's how this manifests itself in a web formalism, like RDF. Suppose I 
>make some RDF (or whatever) assertions about a thing using its URI. If 
>there is a single referent, it must be the same referent *in all possible 
>interpretations of my assertions*. So what I write just IS true or false 
>of that thing, and a reasoning engine ought to be able to find out which 
>*just by looking at the URI*.  But of course it can't possibly find it out 
>in that way, in general, even if we allow it to use Web machinery on the 
>URI; and the reason it can't is because this assumption is completely 
>false: what a URI denotes *depends on the interpretation*, just like names 
>and referring expressions in all the other languages and notational 
>schemes ever invented.
>
>This point is made even stronger if the things that one GETs by using a 
>URI are considered to be representations of resources, since then the 
>meaning of the representation depends on which semantic conventions are 
>applied to it.
>>
>>As for the answer to (1), I agree with you (modulo debatable edge-cases I 
>>don't understand) that a resource is, practically, anything.
>>
>>Did you have any problem with my suggested revision to your words, that 
>>were an attempt to overcome Roy's objection?
>
>You mean
>[[
>The entities referred to or identified by URIs when used in Web contexts 
>are called "resources".  Anything can be a resource:  there is no 
>restriction on the nature of resources."
>]]
>
>Fine with me.
>
>Pat
>--
>---------------------------------------------------------------------
>IHMC                                    (850)434 8903 or (650)494 3973   home
>40 South Alcaniz St.                    (850)202 4416   office
>Pensacola                               (850)202 4440   fax
>FL 32501                                        (850)291 0667    cell
>phayes@ai.uwf.edu                 http://www.coginst.uwf.edu/~phayes
>s.pam@ai.uwf.edu   for spam

-------------------
Graham Klyne
<GK@NineByNine.org>
PGP: 0FAA 69FF C083 000B A2E9  A131 01B9 1C7A DBCA CB5E

Received on Monday, 28 April 2003 10:05:38 UTC