RE: exploring ambiguity via the "something-which-has" URI scheme from Patrick.Stickler@nokia.com on 2003-05-05 (uri@w3.org from May 2003)

From: <Patrick.Stickler@nokia.com>
Date: Mon, 5 May 2003 12:55:28 +0300
To: <sandro@w3.org>
Cc: <uri@w3.org>, <GK@ninebynine.org>, <phayes@ai.uwf.edu>
Message-ID: <A03E60B17132A84F9B4BB5EEDE57957B5FBBB3@trebe006.europe.nokia.com>
> -----Original Message-----
> From: ext Sandro Hawke [mailto:sandro@w3.org]
> Sent: 02 May, 2003 19:29
> To: Stickler Patrick (NMP/Tampere)
> Cc: uri@w3.org; GK@ninebynine.org; phayes@ai.uwf.edu
> Subject: Re: exploring ambiguity via the "something-which-has" URI
> scheme 
> 
> 
> 
> Patrick Stickler writes
> > This appears to be a URI scheme for expressing constellations
> > of properties defined for an anonymous node (and not very unlike
> > open-URIs).
> 
> I've never heard of "open-URIs" and I can't find anything relevant in
> Google.  Pointer?  It certainly seems similar to the "secure URI"
> thread [1] going on in this list.  I was invoking RDF as a way to
> jump to the end of the arms-race for more general languages.

Sorry. The name is "Open URL" (not URI). 

Cf. http://www.dlib.org/dlib/march01/vandesompel/03vandesompel.html

> > I would think that what such URIs really denote would be the
> > *class* of all resources which match those specified properties.
> 
> Ah, the ambiguity of natural language.  Yes, from my examples and the
> way I named the scheme that would be a perfectly reasonable
> interpretation.  As would the idea you feared, that I was proposing a
> URI scheme where URIs did not have a single denotation.  But I defined
> it otherwise.  I thought my text made that clear.
> 
> On the first point:
> 
> Let's call the two versions x-scheme-1 and x-scheme-2.  Each
> x-scheme-1 URI denotes something for which the given RDF graph is
> true.  Each x-scheme-2 URI denotes the class of things for which the
> given RDF graph is true.  
> 
>    <x-scheme-1:foaf_mbox='connolly@w3.org';defpre(foaf_,http://xmlns.com/foaf/0.1/)>
> 
> denotes Dan Connolly, while
> 
>    <x-scheme-2:foaf_mbox='connolly@w3.org';defpre(foaf_,http://xmlns.com/foaf/0.1/)>
> 
> denotes the (singleton) class of things which are Dan Connolly.
> 
> Assuming the empty graph (always true) is written as the empty string,
> we can conclude:
> 
>    <x-scheme-1:> a rdfs:Resource.     # it's something, but we have zero clue which thing
> 
>    <x-scheme-2:> a rdfs:Class.        # we at least know it's a class
> 
> 
> I'm defining my strawman something-which-has URI scheme to be
> x-scheme-1.  

What happens, then, when more than one thing satisfies
the properties you specify in the URI?


> (There's a naming challenge in making this clear in the scheme name.
> Any grammatical phrase, like "individual-such-that" can still be read
> as "(the class of) individual(s) such that".  Recent programming
> conventions make the distinction using letter-case: most of us would
> guess "redThing" denotes an individual which is red, "RedThing"
> denotes a class of red things.  I also like the convention that
> "redThings" would denote a collection of individuals which are red,
> but sometimes redThingList or redThingSet is necessary.)
> 
> On the second point: Consider the noun phrase "a red thing" in "I am
> holding a red thing in my hand."  That noun phrase could denote any
> of a huge variety of things, but I'm using it to refer to a very
> specific physical object.  URIs always refer to a specific thing,
> even when no one happens to know anything about that thing.

Well, that seems to be the crux of a long running debate. 

If a URI is overloaded, then it does not refer to a specific thing.

Even if a specific SW agent considering a specific RDF graph might
presume (and IMO rightly so) that a given URI denotes a specific
thing, the source of its knowledge as captured in that graph may
have been syndicated from two or more sources which ultimately
disagree about the actual denotation of the URIs used, but that
disagreement is not and cannot be reflected in the URIs themselves.

Thus, you may say "this red thing" and I may say "this red
thing", but those may very well be different red things. Your
URI scheme would nonetheless yield the same URI for "a red thing"
in both cases, overloading it and causing confusion.
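
As a rough illustration of that collision, here is a minimal Python
sketch. The something_which_has() helper, the property names "color"
and "heldBy", and the simplified URI syntax are all assumptions made
for the example; they are not part of Sandro's strawman.

```python
def something_which_has(props):
    """Build a hypothetical description-based URI from sorted property pairs."""
    body = ";".join(f"{k}='{v}'" for k, v in sorted(props.items()))
    return f"x-scheme-1:{body}"

# Two sources each describe a *different* red thing, using the same
# minimal set of distinguishing features. Triples are plain tuples.
source_a = {(something_which_has({"color": "red"}), "heldBy", "Sandro")}
source_b = {(something_which_has({"color": "red"}), "heldBy", "Patrick")}

# Merging the two graphs is a set union; the agent has no way to see
# that the shared URI was minted for two different things.
merged = source_a | source_b

subjects = {s for s, _, _ in merged}
holders = {o for _, p, o in merged if p == "heldBy"}
print(subjects)  # a single subject URI
print(holders)   # both holders now attached to that one subject
```

After the merge there is one subject with two holders, and nothing in
the URI records that the sources disagreed.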

Better to just use a uuid: URI to denote the specific resources
and then describe them accordingly. If it is ever determined that
they are the same thing, we can use owl:sameAs to equate the two
UUID-denoted resources.
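
A minimal sketch of that approach, assuming Python's standard uuid
module (which yields the urn:uuid: form, used here as a stand-in for
a uuid: scheme) and plain tuples for triples. The "color" property
and the same_as_closure() helper are hypothetical.

```python
import uuid

OWL_SAME_AS = "http://www.w3.org/2002/07/owl#sameAs"

def same_as_closure(graph, node):
    """Return every node linked to `node` via owl:sameAs, in either direction."""
    seen = {node}
    frontier = [node]
    while frontier:
        current = frontier.pop()
        for s, p, o in graph:
            if p != OWL_SAME_AS:
                continue
            for a, b in ((s, o), (o, s)):
                if a == current and b not in seen:
                    seen.add(b)
                    frontier.append(b)
    return seen

# Each party mints its own name for its own red thing.
thing_a = uuid.uuid4().urn
thing_b = uuid.uuid4().urn
graph = {
    (thing_a, "color", "red"),
    (thing_b, "color", "red"),
}

# Until someone says otherwise, the two names stay distinct.
before = same_as_closure(graph, thing_a)

# If it is later determined that they are the same thing, say so explicitly.
graph.add((thing_a, OWL_SAME_AS, thing_b))
after = same_as_closure(graph, thing_a)
```

The identity claim is then an ordinary statement in the graph, which
can be attributed, questioned, or retracted, rather than a side effect
of two parties happening to build the same URI.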

UUIDs are nice in that they provide all the flexibility of anonymous
nodes but without them actually being anonymous. In fact, I once
suggested that we have a special URI scheme for RDF anonymous nodes
which is essentially identical to the uuid: scheme but called anon:
and which would be used to denote anonymous nodes, rather than
using just local system-specific identifiers. 

Anyway....

> > I think that it is (or should be) a fundamental presumption that,
> > within the scope of the S/Web, a given URI consistently denotes
> > a single thing.
> 
> Agreed.  I think Graham and Pat and you and I are in violent agreement
> on this on this list.

Yes. And a lot of the violence is my fault, being insane enough to 
conduct discussions regarding logic in semantics using only the
English language rather than more precise mathematical terms ;-)

> But there are some tricky edge cases, which is what I am trying to
> flesh out.  In particular, while agents may act as if a URI had one
> true interpretation, they are only acting: we cannot, in general,
> communicate interpretations.  At best, we can arbitrarily constrain
> interpretations; that seems to be good enough for both humans and
> machines.

Right.

But in communicating about things, it seems to be best practice
to choose names that have the greatest chance of being recognized
as widely as possible with the same denotation and have the
smallest chance of being overloaded.

So, having a URI scheme based on simply listing the minimal set
of distinguishing features of a resource, per *your* system,
seems to have an extraordinarily high risk of colliding with
other uses of that same URI but with a different denotation.
Thus, your URI scheme, while logically valid, does not seem to
reflect the above best practice regarding the selection of URIs
to denote resources for the global interchange of knowledge
between arbitrary systems.

It also seems to hide knowledge about the resource in the URI
itself, rather than making it explicitly visible to agents
as statements about the resource denoted by the URI.

> > So if you tried to define a URI scheme that could
> > intentionally be used to provide for overloading of denotation,
> > I would consider that to be in conflict with the fundamental
> > S/Web architecture.
> 
> I think it would also be in fundamental conflict with web
> architecture, but ... I'm not sure how to phrase it for 2396bis.  My
> last attempt [2] was ignored by everyone.  :-)
> 
> Here's another attempt, changing the basic definition of a URI.
> Again, this goes somewhere near the beginning; exact glue can wait.
> 
>    Each URI is a string which conforms to URI syntax and which names
>    something.  The naming relationship gives the URI its primary
>    utility, allowing it to be transmitted in the place of something
>    else.  There is no restriction on what kind of thing (real,
>    imaginary, physical, conceptual, ...) can be named by a URI, but
>    all such things are called "resources".  
> 
>    For a URI to be used effectively, the parties using it in
>    communication often need to share a notion of which resource it
>    names, but this commonality of knowledge does not need to be
>    complete to be effective. In particular, human parties will tend to
>    associate considerable real-world knowledge with the named
>    resource, while software agents will simply maintain the facts
>    about it suitable to their purpose.  In many cases there will be
>    some ambiguity in communication using URIs because of incomplete
>    sharing of knowledge about what is named by each URI, but this
>    ambiguity can often be reduced as far as necessary.

Here you seem to be doing the same thing that Pat seems to be doing,
lumping together without distinction the agreement about the URI to 
resource mapping and agreement about the qualities of the resource
to which the URI refers.

So, while not all parties will be concerned with, nor even agree
about all knowledge globally asserted (somewhere) about the resource
in question, they still need to agree that they are talking about
the same thing -- i.e. that the URI in question refers to one and
only one thing and they agree what that thing is.

Humans can make and test such agreements. SW agents must simply
presume that such an agreement is in force and valid when it
merges two RDF graphs from different sources.


>    [ maybe say something about renaming and about indexical URIs like
>    my.example.com?  something about GoodURIs having a long-term
>    consensus of meanings.... ]
> 
>    In addition to serving as a name, each URI can also serve as a
>    message, conveying information about the named resource.  The
>    language of the message is identified by the URI's scheme part, and
>    to use the information encoded into the URI, an agent must
>    recognize the scheme name and understand the corresponding
>    language.  A common case is for the URI text to convey the network
>    address of a server which can communicate authoritatively about the
>    resource.
> 
> Or something....   Do you agree with what is stated there, even if you
> disagree it has the best wording?

Well, if you are using the URI itself to convey knowledge about
the resource denoted, I would ideally hope to see as part of that
URI scheme some component which reflects a globally unambiguous
(non-overloaded) denotation. Perhaps adding a mandatory component
to the URI scheme which is a UUID, to ensure that naming collisions
do not occur.
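
One way such a component might look, sketched in Python. The
"uuid=" field, its position in the URI, and the describe_with_uuid()
helper are purely hypothetical; no such syntax has been defined for
the scheme.

```python
import uuid

def describe_with_uuid(props, ident=None):
    """Build a hypothetical description URI that carries a mandatory UUID."""
    ident = ident if ident is not None else uuid.uuid4()
    body = ";".join(f"{k}='{v}'" for k, v in sorted(props.items()))
    return f"x-scheme-1:uuid={ident};{body}"

# Two parties describe their own red things with identical properties.
mine = describe_with_uuid({"color": "red"})
yours = describe_with_uuid({"color": "red"})

# The descriptions match, but the URIs do not collide.
print(mine != yours)
```

The description still travels inside the URI, but the UUID, not the
description, is what keeps the name from being overloaded.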

Patrick
Received on Monday, 5 May 2003 05:55:34 UTC