RE: Resources and URIs from Patrick.Stickler@nokia.com on 2003-04-28 (uri@w3.org from April 2003)

From: <Patrick.Stickler@nokia.com>
Date: Mon, 28 Apr 2003 14:54:38 +0300
To: <phayes@ai.uwf.edu>, <gk@ninebynine.org>
Cc: <uri@w3.org>
Message-ID: <A03E60B17132A84F9B4BB5EEDE57957B5FBBA1@trebe006.europe.nokia.com>
> You tell me, how exactly does one 'figure out' what the URI 
> http://www.w3.org/2001/XMLSchema is "bound" to? Even a human being? 
> Try looking up "namespace" in http://www.m-w.com/netdict.htm and see 
> what you get, for a start.

This is what I have been developing URIQA to do. Given a URI, you can
ask a SW-enabled server (either the same server that provides representations
via HTTP GET, or another server, via an explicit URIQA portal) to tell
you about the resource, and the knowledge about the resource may (hopefully,
will) enable you to clarify what the URI denotes, what that resource
actually is. At the very least, it should tell you a bit about the nature
of the resource denoted by the URI.

E.g. what does the following URI denote?

  http://forum.nokia.com/product/terminal/N6310r100

(at the moment, there are no representations, so GET returns 404)

You can't tell (and no fair peeking inside the URI to guess from any
possible mnemonic interpretations of any substrings therein; for
all you know it could be a "terminal (failed) product" ;-)

But you can ask the Nokia Semantic Web server for a description
of the resource:

   http://sw.nokia.com/swe/URIQA?uri=http://forum.nokia.com/product/terminal/N6310r100

and you find out that it is a Wireless Terminal, along with other information
such as its UAProf profile.
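
A minimal sketch of how such a query URL could be assembled from a portal
address and the URI of the resource in question. The portal address and the
resource URI are the ones given above; the helper name uriqa_query_url is
hypothetical, and the sketch only builds the URL without contacting any
server (whether the portal answers is not assumed here).

```python
from urllib.parse import quote

# Portal address as given in the message; the helper name is hypothetical.
URIQA_PORTAL = "http://sw.nokia.com/swe/URIQA"


def uriqa_query_url(resource_uri: str, portal: str = URIQA_PORTAL) -> str:
    """Return the portal URL that asks for a description of resource_uri."""
    # ':' and '/' are left unescaped so the result has the same form as the
    # query shown above; any other reserved characters are percent-encoded.
    return f"{portal}?uri={quote(resource_uri, safe=':/')}"


query = uriqa_query_url("http://forum.nokia.com/product/terminal/N6310r100")
print(query)
```

The same helper could point at any other URIQA portal by passing a different
portal argument, which is the "third party knowledge" case.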

Soon, you will be able to ask the authoritative server forum.nokia.com directly 
about its resources and not have to use a proxy URIQA agent (though this also
demonstrates how both authoritative knowledge from the same server
that provides representations and third party knowledge from other
servers with URIQA portals can work in unison to serve knowledge about
resources on the SW).

(BTW, this is a work in progress, so don't be surprised if it pukes
now and then, it's not yet ready for prime time... ;-)

> Yes, but that is true for all URIrefs, everywhere.  Anyone else can 
> assert that my webpage denotes the Queen.  

How would they do that? *IN* RDF I mean. 

You'd at least have to have some other URI that denotes the Queen,
and say something like

   x:PatsWebPage owl:sameAs y:TheQueen .

But then, how does one define the denotation of y:TheQueen?

At some point, we have to accept that the machinery for actually defining
the URI to resource mapping is outside the scope of the SW machinery.
You've got to have primitives at some foundational layer, and for RDF,
those primitives are the URIs. How those URIs are mapped to resources
is based on processes outside of RDF. No?

> But these URIrefs 'belong' 
> to the server, so any assertions made about them from another source 
> are not warranted by the server.  I don't see any problems arising 
> here that aren't endemic to the entire Web (and so are unlikely to be 
> real problems, in fact, since the Web seems to work quite well.)

Insofar as http: and similar URIs are concerned, which include a
web authority component, I fully agree. The web authority component
serves as a basis for distinguishing authoritative knowledge about
a resource from third party knowledge. 

(Whether a SW agent ranks any third party knowledge the same as or
even higher than the authoritative knowledge is an entirely different
issue)

> ...If you like, think of it as the question, is it kosher 
> to Skolemize on the Web? That is, can I make up a new URI and say 
> that it denotes something, just on the basis of knowing that 
> something *exists*, and not knowing anything else about it? If so, 
> how do I "bind" this thing - about which I know virtually nothing - 
> to my URI, or make my URI "identify" it?  Particularly if I can't 
> identify it and I have nothing to bind it with.

This is why I consider the binding machinery to be below the level
of primitives for RDF and the SW. A SW agent need not be concerned with
how a URI is bound or mapped to a thing in the real world, only that
when it encounters a given URI, it can presume it consistently refers
to the same thing (not that it always *will*, after all, shit happens,
but that it can -- and should -- *presume* that it does).

> >So I think there are two questions:
> >
> >(1) what is a resource?
> >(2) does a URI identify a single particular resource?
> >
> >I think the answer to (2) is "yes" by my understanding of URIs (e.g. 
> >RFC2396 section 1.1:  "An identifier is an object that can act as a 
> >reference to *something* [that has identity]."  Even if you ignore 
> >the problematic words [that has identity] (I think they're redundant 
> >here), I think the words still say that the identifier refers to a 
> >single entity:  "something" is singular.

Graham, do you mean here that, at least by design, URIs should not be overloaded
to denote more than one thing?

> I still want to know what it means for something to be "identified". 
> It sounds like you are saying that it means that there is a single 
> thing - an actual thing, not a representation of a thing - which the 
> URI has to denote. 

I don't know if that's what Graham meant, but I certainly think so.

A representation of a thing is also a thing. And a thing and
its representation must both have distinct URIs if we are to talk
about both of them unambiguously.

One should not expect a given URI to denote both a thing and a
representation of that thing, though this is a common error on the
Web as folks tend to relate what they GET from a server with the
thing denoted by the URI used with GET -- where in fact, what is
actually denoted by the URI is often, if not usually, not the same
resource as the one returned by GET. Clearly, there is a common disconnect between
perception and architecture.
 
> That is admittedly clear, but it has the 
> disadvantage of being an impossible fantasy.  If true, it would mean 
> that URIs had magical properties.
> 
> This 'unique referent' claim, if taken seriously, is an incredibly 
> strong claim. It seems to be predicated on an assumption about the 
> Web which is false of all other known representational and linguistic 
> schemes, that names are 'true names' which *inherently*, in their 
> very nature, identify a single thing in all possible interpretations; 

The RDF graph merge function clearly IMO reflects this assumption.

If nodes in an RDF graph represent resources, and two nodes from
different graphs are merged to a single node because they have the
same URI, then that reflects the assumption that they denote the
same thing.
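
A minimal sketch of that merge behaviour, assuming each graph is just a set
of (subject, predicate, object) triples with URIs written as plain strings.
The example.org vocabulary and the statements themselves are hypothetical,
and blank nodes (which a full RDF merge must keep apart) are ignored.

```python
Triple = tuple[str, str, str]

# Hypothetical vocabulary and data, invented for illustration only.
TERMINAL = "http://forum.nokia.com/product/terminal/N6310r100"
EX = "http://example.org/terms#"

# Knowledge published by two different sources about the same URI.
graph_a: set[Triple] = {(TERMINAL, EX + "type", EX + "WirelessTerminal")}
graph_b: set[Triple] = {(TERMINAL, EX + "manufacturer", "Nokia")}


def merge(*graphs: set[Triple]) -> set[Triple]:
    """Union the triples; equal URI strings collapse into one node."""
    merged: set[Triple] = set()
    for graph in graphs:
        merged |= graph
    return merged


def describe(graph: set[Triple], uri: str) -> set[tuple[str, str]]:
    """Return every (predicate, object) pair whose subject is `uri`."""
    return {(p, o) for s, p, o in graph if s == uri}


merged = merge(graph_a, graph_b)
# Both statements now hang off a single node, because the merge treats the
# shared URI as denoting the same resource in both graphs.
for predicate, obj in sorted(describe(merged, TERMINAL)):
    print(predicate, obj)
```

Nothing in the merge inspects what the URI "really" denotes; string equality
of the URIs is the only test applied.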

My understanding of (one of) the key points of the SW is to have
a global knowledge base whereby arbitrary agents can interchange
knowledge and reason about the same things *consistently*, insofar
as the knowledge is concerned. I.e., that URIs would always have
a consistent interpretation, globally.

If that is not the case, then I don't see the point of the SW.
What we end up with is simply many closed systems that simply use
the same infrastructure, but cannot reliably interchange knowledge,
since there is not even the *presumption* that there would be
consistent interpretation of URI denotations.

> URIs, according to this, are names with LOGICALLY NECESSARY 
> denotations.  No other names are like that, in any human or 
> artificial language or naming scheme ever devised, except maybe 
> numerals (and even then not if you want to stay computable.).  

Firstly, please let's leave natural language out of this discussion.
I don't think it is productive to compare RDF with natural
languages. One of the purposes of formal languages is to escape
the many varied quagmires of ambiguity and imprecision inherent
in natural language. The comparison is not useful. Just because
natural language has certain inherent characteristics does not
mean that RDF or the SW must also embody them.

Secondly, is there any reason why URIs should not be innovative?
Why should any other artificial language need to have formerly employed
such a naming scheme in order for the SW to employ URIs in
that manner? Is there some inherent pitfall or fallacy in doing so?
Surely progress is all about doing something new and different.
If we are restricted only to what has been done before, then how
do we progress to anything better?

> Even 
> if you were to physically attach the names to their intended 
> referents, like name badges worn by people at a symposium, there 
> would still be some  ambiguity: does it denote the person at that 
> moment, the person considered as a citizen, the person's body, the 
> person's clothing, the role they are playing in the gaming 
> convention....?? I know there is an answer in the case of name 
> badges, but the point is that this answer depends on an external 
> convention known to the users, an implicit shared set of assumptions, 
> a background. It's not inherent in the very idea of a name badge. You 
> CAN interpret name badges differently, and I expect some symposia do. 
> And as soon as you allow this kind of contextuality, you lose 
> uniqueness of denotation. But according to what you say here, it is 
> *logically impossible* to mis-interpret a URI.

(Sorry Graham for continuing to jump in here, I can't help myself ;-)

I myself wouldn't argue that it is impossible for overloading to
occur, where different *users* presume a different denotation, and
eventually, we can hope to have sufficiently intelligent SW agents
to be able to identify and even cope with such overloading, but
from the viewpoint of a SW agent which is not sufficiently intelligent,
it has at present no recourse but to presume that, every
time it sees the same URI, it denotes the same thing. If
overloading occurs, that may result in inferences that do not correspond
to the real world. C'est la vie. The users then need to fix the
problem, which is the overloading of denotation.

Thus, from the SW agent's perspective, yes, it seems as if it is
logically impossible to mis-interpret a URI, since a URI would
be presumed to have a single, globally consistent denotation.

Looking at the system from the outside, as a user, however, then
one can say that insofar as the specifications/presumptions of
the denotations are concerned, mis-interpretation by the users,
and subsequent mis-use by users, is certainly possible.

One simply needs to keep clear whether it is the SW agents within
the scope of the SW machinery or the users outside the scope of the SW
machinery which are doing the interpretation. SW agents cannot 
mis-interpret, since they have no access to the sub-primitive machinery 
used to define URI to resource mappings. Users can however mis-interpret and
subsequently introduce those mis-interpretations into the SW as
overloading of denotation, which results in ambiguity that can
cause unexpected inferences.
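
A minimal sketch of how such overloading could surface inside an agent,
assuming graphs are sets of (subject, predicate, object) triples and the
agent applies toy rules of the form "whatever has this property belongs to
this class". The URI, the example.org vocabulary, the data values and the
rules are all hypothetical.

```python
Triple = tuple[str, str, str]

EX = "http://example.org/terms#"      # hypothetical vocabulary
OVERLOADED = "http://example.org/pw"  # hypothetical URI used in two senses

# One user takes the URI to denote a web page ...
from_user_1: set[Triple] = {(OVERLOADED, EX + "sizeInBytes", "2048")}
# ... another takes the same URI to denote the person the page describes.
from_user_2: set[Triple] = {(OVERLOADED, EX + "birthDate", "1970-01-01")}

# Toy rules: having the property on the left implies the class on the right.
RULES = {
    EX + "sizeInBytes": EX + "Document",
    EX + "birthDate": EX + "Person",
}


def infer_types(graph: set[Triple]) -> dict[str, set[str]]:
    """Apply RULES to every triple and collect the inferred classes per URI."""
    types: dict[str, set[str]] = {}
    for subject, predicate, _ in graph:
        if predicate in RULES:
            types.setdefault(subject, set()).add(RULES[predicate])
    return types


# The agent has no access to either user's intent; it only sees one URI.
merged = from_user_1 | from_user_2
types = infer_types(merged)
print(sorted(types[OVERLOADED]))
```

The agent ends up treating one node as both a document and a person. The
inference is valid given its inputs; the fault lies in the overloaded
denotation introduced by the users.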

> Here's how this manifests itself in a web formalism, like RDF. 
> Suppose I make some RDF (or whatever) assertions about a thing using 
> its URI. If there is a single referent, it must be the same referent 
> *in all possible interpretations of my assertions*. So what I write 
> just IS true or false of that thing, and a reasoning engine ought to 
> be able to find out which *just by looking at the URI*. 

I don't see how it can find out anything by just looking at the URI
in isolation, since that URI is a primitive and it has no access
to the machinery by which the URI is mapped to the actual thing.

> But of 
> course it can't possibly find it out in that way, in general, even if 
> we allow it to use Web machinery on the URI; and the reason it can't 
> is because this assumption is completely false: what a URI denotes 
> *depends on the interpretation*, just like names and referring 
> expressions in all the other languages and notational schemes ever 
> invented.
> 
> This point is made even stronger if the things that one GETs by using 
> a URI are considered to be representations of resources, since then 
> the meaning of the representation depends on which semantic 
> conventions are applied to it.

Why? If you have a URI, and that URI denotes a resource, but you have
no idea what that resource is, and you give that URI to an HTTP server
as part of a GET request, there is no guarantee whatsoever that what
you get back will help you in the least in figuring out what resource
is actually denoted by that URI.

Insofar as the Web is concerned, that doesn't matter. Folks just
want to get useful content and are not concerned with the level of
precision of denotation that the SW requires. It's not surprising that Web
users frequently confuse the denotation of URIs between resources
and representations of those resources, or even other things
indirectly referred to or depicted in the representations. 

But just because (a) the Web does not need the level of precision
that the SW needs, and (b) human users of the Web can deal with
overloading of URI denotation, it does not follow that the SW should
either not care about overloading of denotation or be expected to be
able to deal with it (at this stage).

> >As for the answer to (1), I agree with you (modulo debatable 
> >edge-cases I don't understand) that a resource is, practically, 
> >anything.

I certainly agree with the assertion that a resource can be anything
(forget the 'practically').

Patrick

--
Patrick Stickler, Nokia/Finland, (+358 40) 801 9690, patrick.stickler@nokia.com
Received on Monday, 28 April 2003 07:54:43 UTC