Re: Resources and URIs

>This message does not contribute directly to the RFC2396bis wording debate.
>
>I'm trying to explore, in response to Pat's comments, reconciliation 
>of formal notions of denotation with the commonly understood idea of 
>URIs as identifiers.

In brief: One cannot simultaneously maintain BOTH that URIs denote 
things in the world (not just inside computers), AND that the 
relationship between URIs and what they denote is similar to the 
relationship between identifiers in programming languages and what 
they conventionally identify.  This is because the 'identifier' idea 
relies on a basic property of computable domains which is not true in 
the wider world, viz. that any set of equations over such a domain 
has a unique minimal solution which is computably enumerable.  Most 
of the universe is not like this.  So if this is really 'commonly 
understood', then there is a serious mismatch between this common 
understanding and the RFC 2396 notion of the meaning of a URI. 
Either the Web is only about (semi-)computable things, or Web 
reference is not (semi-)computable.  Someone decide, but don't try to 
pretend that you can have it both ways at once.

>At 15:32 25/04/2003 -0500, pat hayes wrote:
>
>>>[Offlist... deviating]
>>>
>>>At 13:19 24/04/2003 -0500, you wrote:
>>>>>At 16:51 23/04/2003 -0700, Roy T. Fielding wrote:
>>>>>>>"This document specifies the syntax of URIs, which are a form 
>>>>>>>of global identifier used in Web protocols and languages. 
>>>>>>>Particular uses of URIs, and their intended meanings in 
>>>>>>>various contexts, are described in other specifications. In 
>>>>>>>general, the entities referred to or identified by URIs when 
>>>>>>>used in Web contexts are called "resources", but this 
>>>>>>>document does not specify the nature of resources or 
>>>>>>>restrict resources to any particular category of entities."
>>>>>>>
>>>>>>>and leave it at that.  Nothing else at all about resources, no 
>>>>>>>examples, no discussion.
>>>>>>
>>>>>>No.  Look, you guys aren't the ones who have to answer questions in the
>>>>>>absence of definitions.  I do.    I refuse to leave what has 
>>>>>>been deployed
>>>>>>in an unspecified state, regardless of how many arguments that causes
>>>>>>in the Semantic Web.
>>>>>
>>>>>OK.  I assume this is in response to the phrase "this document 
>>>>>does not specify ...".
>>>>>
>>>>>I think the argument about 'refer' vs 'identify' is a bit 
>>>>>sterile, because I think I can supply an identifier for anything 
>>>>>that Pat can refer to, and I think that anything with an 
>>>>>identifier can be said to have identity (the identifier being 
>>>>>sufficient if not necessary); and clearly anything identified 
>>>>>can be referenced.
>>>>
>>>>Well, I'm worried about things like descriptions. Here's a 
>>>>topical example. In designing DQL we had to consider the case 
>>>>where a server is able to prove that something exists which 
>>>>satisfies the query but it has no URI to hand back as a binder to 
>>>>the must-bind variable in the query pattern.  Now, DQL can make 
>>>>up a URIreference of its own to be the 'name' of the thing it 
>>>>knows must exist. Fine so far, but is it really correct to say 
>>>>that this URI *identifies* that thing? I honestly do not know, 
>>>>but I am worried that many folk (eg Michael Mealling 
>>>><michael@neonym.net>, at a guess) will read that as meaning 
>>>>something much stronger, for example as meaning that the URIref 
>>>>has been 'bound' to the thing in some sense (I'm not sure what 
>>>>this means, but it sounds a lot tighter than merely being used to 
>>>>refer to, in a query transaction).  With that understanding, what 
>>>>the DQL server does in this case could be an incorrect or 
>>>>inappropriate use of URIrefs, or 'gibberish', in Michael's phrase.
>>>
>>>I don't perceive this as a "what is a resource" issue so much as a 
>>>(possible) violation of the expectation that a given URI 
>>>identifies some particular resource.
>>
>>But look, I'm trying to find out what that MEANS.  All these words 
>>like "identify", "bound to" are terms of art which aren't being 
>>defined.  What am I committing to if I agree that a URI 
>>"identifies" something? I'm pretty sure that Michael and other folk 
>>think I'm committing to a whole lot more than I want to commit to. 
>>For example, if it means that for each URI there has to be one 
>>single thing which that URI must denote, then I certainly don't 
>>want to commit to that (see later for why: basically, this is a 
>>claim that URIs have magical powers.).
>
>Oops, in that message, to you, I should have said "denotes".  (I was 
>trying to separate the nature-of-resource question from 
>what-is-identity question, and failed.)

OK, let's try to keep them separate; although I think they have been 
thoroughly mixed up already, before the SW came along.

>As for the uniqueness of denotation... I think I begin to see the 
>problem here.  .....
>
>>In the case in question, the server knows that there is a thing 
>>that satisfies the query; but that is ALL it knows.  It can't 
>>'identify' a particular thing, it can't access it, it has no 
>>representation of it (other than the assertion in the query pattern 
>>itself), there is no web-accessible thing that can be said to be 
>>'about' it:  none of that at all, nothing that fits into the REST 
>>model. It just knows that something exists. Now, is that enough to 
>>say that it has 'identified' that thing (about which it knows 
>>diddly-squat other than that it exists) ?? Or not?? I just want a 
>>clear answer one way or the other, is all.  I want to know what 
>>this word "identifies" is supposed to mean.
>>
>>Note, it would be wrong to say that there is a single unique thing 
>>and the server can 'identify' that thing in the sense of singling it 
>>out from all other things. It can't. That is a *much* stronger 
>>claim than its knowing that there is only one such thing, eg 
>>consider the difference between 'Someone in this room murdered the 
>>Count' and (pointing at someone) 'That person murdered the Count'.
>>
>>>  Isn't this why we introduced bnodes distinct from URIs?
>>
>>For use in reasoning, yes, but there is no way to pass a bnode 
>>outside its scope as an answer binding (and it would be meaningless 
>>to do so); and the only kinds of bindings we have are URIrefs.
>
>In logic, I think one uses expressions like (exists x . f(x)) to express such.

No. This is an aside, but that expresses a *proposition*, not an 
anonymous identification. There isn't anything in conventional logic 
to do this any better than using a 'new logical name' already does 
it, in fact. In English you might say "something" or use tricks like 
"John Doe" to indicate that you aren't meaning to identify anyone in 
particular. We thought of allowing a special DQL bnode-ish construct 
to indicate an unknown thing 
(http://www.daml.org/listarchive/joint-committee/0897.html), but in 
fact I think that Skolemized URIs work just fine. The existing Web 
protocols and conventions are perfectly adequate, in fact, to handle 
this. They are BETTER than the current RFC 2396 prose says they are.
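
A minimal sketch of the Skolemization move, in Python. Everything named 
here (the Server class, the example.org namespace, the ex: terms) is 
invented for illustration and is not part of DQL or any spec. The server 
can prove that something satisfies the query pattern, has no name for 
it, and so makes up a URI to hand back as the binding.

```python
import uuid
from typing import Dict, Iterable, Optional, Set, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[str, str]  # (subject, predicate) with the object as ?x

# Hypothetical namespace assumed to be owned by the answering server.
SKOLEM_BASE = "http://example.org/skolem/"


def mint_skolem_uri() -> str:
    """Make up a fresh URI to stand for 'something known to exist'."""
    return SKOLEM_BASE + uuid.uuid4().hex


class Server:
    def __init__(self, facts: Iterable[Triple], existentials: Iterable[Pattern]):
        self.facts: Set[Triple] = set(facts)
        # Patterns for which the server can prove that some object exists
        # without having any name for it.
        self.existentials: Set[Pattern] = set(existentials)
        self._skolems: Dict[Pattern, str] = {}

    def answer(self, subject: str, predicate: str) -> Optional[str]:
        """Return a binding for ?x in (subject, predicate, ?x), if any."""
        for s, p, o in sorted(self.facts):
            if (s, p) == (subject, predicate):
                return o  # a name the server already has
        key = (subject, predicate)
        if key in self.existentials:
            # Reusing one made-up name per pattern is a design assumption
            # of this sketch, not something the query protocol requires.
            if key not in self._skolems:
                self._skolems[key] = mint_skolem_uri()
            return self._skolems[key]
        return None  # nothing known, nothing provable


server = Server(
    facts=[("ex:Joe", "ex:hasParent", "ex:Mary")],
    existentials=[("ex:Joe", "ex:hasSibling")],
)
binding = server.answer("ex:Joe", "ex:hasSibling")
```

Nothing in the sketch 'binds' the minted URI to a particular thing or 
singles that thing out. The server only uses the name consistently for 
whatever it knows must exist.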

>    If a query answer is an RDF graph with bnodes, I interpret that 
>as meaning much the same thing -- I'm not asking to export the 
>binding, just express the query result.  My concern is that URIs are 
>not the right form to denote some elements of query results, because 
>some parts of query results are not "resources" in the intended 
>sense.

Well, that is what bothers me, to get back to the point. ARE they, in 
the intended sense? What IS the intended sense? I can read the prose 
either way as it stands.

>>>If there were exactly one thing that satisfied the query, then I 
>>>don't see the generated URI being problematic, but having it stand 
>>>for any of some collection of possible (query result) values then 
>>>it's more questionable.
>>
>>It doesn't stand for the collection, it stands for a thing known to 
>>exist.  Of course if there were more than one of them, then it is 
>>ambiguous; but since it is ambiguous anyway, that hardly seems to 
>>matter.
>
>I don't see that contradicting what I'm trying to say.
>
>>>  I suppose you might finesse the meaning by saying it's bound to 
>>>some particular value without providing any further means to 
>>>figure which one
>>
>>Exactly; that is what the model theory says about all URIrefs, 
>>though, and has always said about them, since they are always 
>>interpreted relative to an interpretation.
>>
>>You tell me, how exactly does one 'figure out' what the URI 
>>http://www.w3.org/2001/XMLSchema is "bound" to? Even a human being? 
>>Try looking up "namespace" in http://www.m-w.com/netdict.htm and 
>>see what you get, for a start.
>
>One of the aspects of model theory that it's difficult for ordinary 
>programmers to fully take on board is the idea of any denotation 
>being with respect to a given interpretation, and the fact that 
>there is (for MT) no single overriding interpretation that is the 
>root of all truths.

It's not 'for MT', it's for LANGUAGE. (eg 
http://cognet.mit.edu/MITECS/Entry/roberts2)

>I think I can see why MT needs to take this approach in the 
>justification of proof techniques.  If something works for all 
>possible interpretations, it must also work for any particular 
>interpretation.
>
>But the jobbing programmer in me doesn't need all this stuff:  for 
>most practical work, especially at the level of Internet protocols, 
>a URI is paired with ("bound to") a resource in Humpty-Dumpty 
>fashion:  it means what I say it means (or what the specs say it 
>means, or what it has been determined through interoperability 
>testing that most developers think the specs say it means).

Well, I've been a jobbing programmer in my time, and I don't think 
this is true at all.  What is true is that one often thinks of one's 
identifiers (in code) as having particular values in a particular 
environment, eg data structures or whatever; and the semantic rules 
for programming languages tell you how this is possible, because they 
are defined over closed domains which satisfy the first recursion 
theorem (all recursive equations have a UNIQUE least fixed point.) 
But that only applies to the 'internals' of your code; it doesn't 
apply as soon as you ask what those data structures REFER to. Most 
jobbing programmers are capable of either not giving a damn about that, 
or more often being quite hair-raisingly casual about having their 
data structures denote a whole slew of things at the same time, with 
the choice determined by context. I bet it would be almost impossible 
to write a compiler if you tried to keep telling yourself that every 
identifier denoted some particular thing and you had to always make 
sure it stuck to that one thing.
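
A minimal sketch of the 'unique least fixed point' property, in Python. 
The graph and the equation are invented for illustration. The equation 
reach = {a} + successors(reach) has more than one solution, but 
iterating from the empty set always arrives at the same minimal one.

```python
from typing import Callable, Dict, FrozenSet, Set

# Hypothetical graph, purely for illustration: a -> b, and c -> c.
EDGES: Dict[str, Set[str]] = {"a": {"b"}, "b": set(), "c": {"c"}}


def step(reach: FrozenSet[str]) -> FrozenSet[str]:
    """Right-hand side of the equation  reach = {a} | successors(reach)."""
    successors = {m for n in reach for m in EDGES[n]}
    return frozenset({"a"}) | frozenset(successors)


def least_fixed_point(
    f: Callable[[FrozenSet[str]], FrozenSet[str]],
    bottom: FrozenSet[str],
) -> FrozenSet[str]:
    """Iterate f from the bottom element until nothing changes.

    Terminates here because f is monotone and the domain is finite.
    """
    current = bottom
    while True:
        nxt = f(current)
        if nxt == current:
            return current
        current = nxt


least = least_fixed_point(step, frozenset())

# A larger set also solves the equation, but it is not the least solution.
other = frozenset({"a", "b", "c"})
other_is_solution = step(other) == other
```

Inside the program, `least` has one determinate value fixed by the 
equations alone. Nothing comparable fixes what the node names "a", "b" 
and "c" refer to outside the program.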

>Implicitly, there is a single interpretation that defines the 
>denotation of the URI.

People keep SAYING that, but I haven't seen a single ARGUMENT for it. 
For most URIs, trying to insist that there is one thing they denote, 
and then to argue about what exactly it is, has just given rise to 
long, energetic and irreconcilable debates.  Meanwhile, ol' man Web 
keeps rolling along, and these debates are irrelevant to it.  How 
does it DO that, I wonder? Does it know something that we don't know? 
Or maybe it doesn't care? If so, why are WE arguing about it?

>For most practical purposes, I really don't want to be concerned 
>with interpretations other than the one I "know" I'm concerned with. 
>On this basis, people have constructed a functioning Internet.

The fact that the Internet functions is an indication that even 
though one person may be using a URI with one thing in mind, and 
another person using it with another thing in mind, the variations in 
these interpretations somehow do not matter. The same thing happens 
in normal language. This is a very interesting fact that deserves to 
be analysed carefully.  But just DECLARING that 'URIs have a unique 
denotation, and that is why it all works' isn't a theory.  In order 
for this to explain why the Web works, you would have to explain why 
it is that *even though the interpretation you had in mind* wasn't 
what the URI *actually* denoted, it still managed to work. Or 
else you have to explain how it comes about that the interpretation 
you have in mind just IS the right one, which sounds like magic to 
me, or maybe e-telepathy.

>So how can we deal in terms of this "fiction" of a single 
>interpretation in which URIs Identify (denote a single 
>thing/concept) -- which I believe we must do if the specification is 
>to have any traction with real developers

I don't believe this for a second.  In fact, I think that most 
developers would say that it is a fantasy (or irrelevant), if the 
notion of denotation were explained to them. It's a fantasy for any 
program which claims to manipulate denoting expressions. It's like 
saying that you can't write DB code unless you believe that all the 
table entries are somehow magically bound to the things they are 
being used to say things about.

What does need to be explained is that denotation doesn't work 
everywhere the same way that it does in programming languages, 
because the entire world isn't like the inside of a computer: even a 
jobbing programmer needs to bear that in mind from time to time.

>-- without committing you to a belief in magic?  (In part, I agree 
>with you to say as little as possible, but maybe a tiny bit more 
>than you said -- per my proposed wording copied at the end of this 
>message.)
>
>>>, but this is starting to look to me like a problem looking for 
>>>somewhere to happen.  Even if the generated name is truly unique, 
>>>never to be returned again, other folks might still start trying 
>>>to make assertions about it ("I think it's the first value that 
>>>matches the query" and "I think it's the smallest value that 
>>>matches the query", etc.), so it starts to acquire a specificity 
>>>that really was not intended for the query result.
>>
>>Yes, but that is true for all URIrefs, everywhere.  Anyone else can 
>>assert that my webpage denotes the Queen.  But these URIrefs 
>>'belong' to the server, so any assertions made about them from 
>>another source are not warranted by the server.  I don't see any 
>>problems arising here that aren't endemic to the entire Web (and so 
>>are unlikely to be real problems, in fact, since the Web seems to 
>>work quite well.)
>
>Yes, I agree.  But somehow there is this expectation of a particular 
>interpretation, which in many cases is widely-enough held to work 
>for many practical applications...
>
>>But if this bothers you, assume that the server knows that 
>>something exists and that it is unique, i.e. that there is only one 
>>thing satisfying the query (eg it might know that the set of Joe's 
>>siblings has cardinality 1).  Still, if that is ALL it knows, is it 
>>OK to say that this thing is "identified", and that if the server 
>>makes up a URI for it, that the URI is "bound" to it? How does one 
>>answer a question like that?
>
>I think there is an expectation that URIs are usable outside of any 
>context in which they may be generated.

I can go with that, provided we allow 'useable' to have a wide enough 
sense.  It works for the DQL skolemization case, for example.  Being 
useable is not the same as having a unique referent 'bound' to it.

>(Though there is not, I think, any general expectation that any 
>string that happens to follow URI syntax is usable as a URI.)
>
>I'm not sure that this expectation holds for a query result.
>
>The problem here, I guess, is that there's no *formal* way to 
>distinguish these cases -- neither a URI nor a query result is 
>formally defined to denote a particular value, but there's an 
>expectation in one case that affects the way that people want to use 
>them differently.
>
>>>Another example:  when CWM
>>
>>Well, CWM violates the RDF spec.
>>
>>>uses unbound graph nodes in queries, not quite the same as what you 
>>>describe, it requires the names to be qualified with a log:forall 
>>>qualifier, which is all beyond standard RDF as I understand it.
>>
>>Indeed, way beyond. And that seems to be about bnodes in the query 
>>(which act like universals rather than existentials, in a query), 
>>not bindings to query variables.
>>
>>>  There's also a log:forSome which may be used for the case of 
>>>unbound query results, maybe similar to your DQL case.
>>
>>Well, certainly things would be easier for DQL if it were possible 
>>to return an explicit existential, but the basic issue about URIs 
>>would still remain. If you like, think of it as the question, is it 
>>kosher to Skolemize on the Web? That is, can I make up a new URI 
>>and say that it denotes something, just on the basis of knowing 
>>that something *exists*, and not knowing anything else about it? If 
>>so, how do I "bind" this thing - about which I know virtually 
>>nothing - to my URI, or make my URI "identify" it?  Particularly if 
>>I can't identify it and I have nothing to bind it with.
>
>I admit I can't formally distinguish the cases.  But I'm not so 
>comfortable with using URIs as Skolem constants.  I don't think we 
>can yet declare a consensus about whether or not it's kosher.

I'm willing to declare that it is, because it works and it doesn't 
violate any *technical* part of any W3C spec. If it violates some 
philosophical part of a spec, then as a user I don't really care. (As 
a spec editor I do, of course, but for different reasons.)

.....

>>
>>I still want to know what it means for something to be 
>>"identified". It sounds like you are saying that it means that 
>>there is a single thing - an actual thing, not a representation of 
>>a thing - which the URI has to denote.  That is admittedly clear, 
>>but it has the disadvantage of being an impossible fantasy.  If 
>>true, it would mean that URIs had magical properties.
>
>Yes, formally, it's a fantasy -- not sustainable.  But it's a 
>fantasy widely-enough believed to be of practical value.

So the business of writing specs is not to describe, or even to 
prescribe, but to offer parables and fantasies? If so, then I insist 
on my right to treat them as fiction and still say that I am 
conforming to them.

The general tendency in science since the 17th century suggests that 
it is usually better to admit doubt and look for explanations than to 
assert known fantasies as truth.

>>This 'unique referent' claim, if taken seriously, is an incredibly 
>>strong claim. It seems to be predicated on an assumption about the 
>>Web which is false of all other known representational and 
>>linguistic schemes, that names are 'true names' which *inherently*, 
>>in their very nature, identify a single thing in all possible 
>>interpretations; URIs, according to this, are names with LOGICALLY 
>>NECESSARY denotations.  No other names are like that, in any human 
>>or artificial language or naming scheme ever devised, except maybe 
>>numerals (and even then not if you want to stay computable.).  Even 
>>if you were to physically attach the names to their intended 
>>referents, like name badges worn by people at a symposium, there 
>>would still be some ambiguity: does it denote the person at that 
>>moment, the person considered as a citizen, the person's body, the 
>>person's clothing, the role they are playing in the gaming 
>>convention....?? I know there is an answer in the case of name 
>>badges, but the point is that this answer depends on an external 
>>convention known to the users, an implicit shared set of 
>>assumptions, a background. It's not inherent in the very idea of a 
>>name badge. You CAN interpret name badges differently, and I expect 
>>some symposia do. And as soon as you allow this kind of 
>>contextuality, you lose uniqueness of denotation. But according to 
>>what you say here, it is *logically impossible* to mis-interpret a 
>>URI.
>
>I don't know if this is just word-play, but my view is not to 
>attempt to claim a URI has a single denotation in all possible 
>interpretations, but to suggest that there exists a particular 
>intended interpretation, which provides a denotation for all URIs, 
>and hence that the thing "identified" by a URI is that which it 
>denotes in this particular interpretation.

I agree that makes sense, but it's just as fantastic, and just as 
corrosive to semantic analysis. It means that A entails B just when B 
is true in the single intended interpretation, for example, so that 
inference engines should ignore any facts they may have been told, 
and just read off their conclusions from the single True 
Interpretation. That is like telling the inference engines to seek 
enlightenment through prayer.

>It seems clear that we don't know enough to completely specify this 
>interpretation, but on the other hand there are demonstrably a 
>useful number of things we do agree on for it to be of some 
>practical use.

The current MT semantic framework expresses this perfectly. There are 
a number of things we agree on.  Express those things somehow, and 
they amount to some assertions which we both can agree to accept as 
true. We have a "common ground", and we have agreed to limit the 
interpretations to those that keep it true. The practical use arises 
right here: we are both able to draw the same conclusions.  This is 
how most communication works, maybe ALL communication, in fact.  But 
we don't need to have so much agreement between us that we have 
managed to pin down a SINGLE interpretation. If we did need to do 
that it would get in the damn way, we would be always arguing over 
*exactly* what one another meant (just like we do on these mailing 
lists, but all the time about everything), we wouldn't be able to 
move until we were SURE that the thing you were referring to was 
EXACTLY what I was referring to, and so on. We don't get stuck like 
this because we don't need one globally perfect alignment of 
reference: we just need a good enough alignment for the task in hand. 
(This last isn't just a philosophical claim, by the way: it is easy 
to show empirically that people who are cooperating in practical 
tasks are often systematically misunderstanding their intended 
referents without even knowing that they are, because of course it 
doesn't matter.)
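
A minimal sketch of the 'common ground' idea, in Python. The domain, the 
name ex:a and the properties are invented for illustration, and the 
sketch simplifies the model theory by holding the property extensions 
fixed and varying only what the name denotes. The agreed assertion 
narrows the interpretations without pinning down one, yet both parties 
still draw the same conclusion.

```python
from itertools import product
from typing import Callable, Dict, List

Interpretation = Dict[str, str]  # name -> thing it denotes
Sentence = Callable[[Interpretation], bool]

# Hypothetical world of four things and one name.
DOMAIN = ["t1", "t2", "t3", "t4"]
NAMES = ["ex:a"]
RED = {"t1", "t2"}
ROUND = {"t1", "t2", "t3"}


def all_interpretations() -> List[Interpretation]:
    """Every way of assigning a thing in DOMAIN to each name."""
    return [dict(zip(NAMES, choice))
            for choice in product(DOMAIN, repeat=len(NAMES))]


def models(common_ground: List[Sentence]) -> List[Interpretation]:
    """Interpretations that make every agreed assertion true."""
    return [i for i in all_interpretations()
            if all(s(i) for s in common_ground)]


def entails(common_ground: List[Sentence], conclusion: Sentence) -> bool:
    """True when the conclusion holds in ALL remaining interpretations."""
    return all(conclusion(i) for i in models(common_ground))


def a_is_red(i: Interpretation) -> bool:
    return i["ex:a"] in RED


def a_is_round(i: Interpretation) -> bool:
    return i["ex:a"] in ROUND


def a_is_t1(i: Interpretation) -> bool:
    return i["ex:a"] == "t1"


common_ground = [a_is_red]
remaining = models(common_ground)               # two candidates survive
shared_conclusion = entails(common_ground, a_is_round)
pinned_down = entails(common_ground, a_is_t1)   # not entailed
```

One party may have t1 in mind and the other t2. The agreed assertion 
never settles which, and for the conclusion that ex:a is round it does 
not need to.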

It takes a LOT of asserting to pin down a single unique referent, even 
if it can be done at all. In actual human discourse, when it is 
really needed, we resort to legal language to do this, which is 
long-winded, highly technical, very hard to get right and hard to 
follow (and even then it might not, in fact, pin down a single unique 
referent.)  Most of the time we don't bother, and most of the time it 
doesn't matter.  Same for the Web, I suggest.

Pat


-- 
---------------------------------------------------------------------
IHMC					(850)434 8903 or (650)494 3973   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola              			(850)202 4440   fax
FL 32501           				(850)291 0667    cell
phayes@ai.uwf.edu	          http://www.coginst.uwf.edu/~phayes
s.pam@ai.uwf.edu   for spam

Received on Monday, 28 April 2003 16:32:08 UTC