Re: universal languages from Stefan Decker on 2001-02-06 (www-rdf-logic@w3.org from February 2001)

From: Stefan Decker <stefan@db.stanford.edu>
Date: Tue, 06 Feb 2001 09:39:44 -0800
To: pat hayes <phayes@ai.uwf.edu>
Cc: www-rdf-logic@w3.org
Message-Id: <5.0.2.1.2.20010206092603.0263e670@db.stanford.edu>
At 10:14 PM 2/5/2001 -0600, you wrote:
>> > What I don't see the reasoning behind is the insistence that
>> > URI's should be used to name *everything*.
>>
>>So what's the alternative: string literals?
>
>The alternative is, in general, unicode character strings, or numerals, or 
>whatever else I and the people I am communicating with might want to use 
>to refer to anything. Hell, it might be little movie gifs one day for all 
>I know. Why should this be restricted to some predetermined syntax?

Two hypotheses:

(a) Globally unique object identifiers are necessary for the
      context-free representation and exchange of data on the web.
(b) URIs are a (not necessarily perfect or well-defined) way to construct
     globally unique object identifiers.

"Unique" just means that an identifier identifies exactly one entity,
not that the entity can't have multiple identifiers.
Without unique object identifiers there is no way to
construct a "web" of interconnected data.

An example:
The following pages
(delivered by Google http://www.google.com/search?q=Pat+Hayes )
talk about a Pat Hayes.

1) http://ourworld.compuserve.com/homepages/lamontcranston/patlamon.htm
2) http://www.coginst.uwf.edu/~phayes/
3) http://realtors.mls.ca/remaxoa/agents/rmx0182.html
4) http://www.harborne-cc.co.uk/index.html?players/hayes.htm&2
5) http://www.clarelibrary.ie/eolas/cominfo/democracy/councillors/cllr_pat_hayes.htm
6) http://www.mpsi.net/users/dcnpat/main.htm
7) http://www.cise.ufl.edu/~ddd/FLAIRS/FLAIRS-96/hayes.htm
8) http://www.sunysb.edu/stuaff/advocate/spring00/advoc-4.htm
9) http://www.bu.edu/track/NewEngland/indoor/2000indoorresults.html
10) http://www.imprint.co.uk/online/salt.html
11) http://www.rounder.com/rounder/catalog/bylabel/atm/1119/1119.html

Let's assume that the information contained in these pages
is available in some formal language, so that an agent is able
to reason about it. Let's further assume that the authors of the pages
have not used a method to construct a globally unique ID for their version
of Pat Hayes.

How would an automated agent find out that pages 2, 7 and 10 talk
about the same Pat Hayes?
(a) The agent can't assume that the OIDs used in the
     pages are mutually disjoint.
     To reason correctly about the available information,
     the agent needs to assign a specific object identifier, unique
     within its own context, to each Pat Hayes mentioned in the pages,
     since it needs (internally) a way to distinguish the different entities.

     Let's say it uses "pat_hayes_1" for the Pat Hayes on the first page,
     "pat_hayes_2" for the one on the second page, etc. So object identifiers
     are necessary at least internally.

(b) Then the agent would compare the properties of the different
     Pat Hayes entities and (e.g.) compute probability measures for the
     identity of the different object identifiers, based on their
     properties.

Finally, it would arrive at "pat_hayes_2" = "pat_hayes_7" = "pat_hayes_10".
How would it be possible to publish this result on the web in turn?
The OIDs "pat_hayes_2", "pat_hayes_7" and "pat_hayes_10" are
unique only in the internal context of the agent. Nobody else would know their
meaning, so it doesn't make much sense to publish the result without
a globally unique OID for each of the Pat Hayes entities.

Furthermore, it would have been useful if pages 7 and 10 had
used the same OID as page 2 - then the effort the agent
had to perform would have been unnecessary in the first place.
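
Steps (a) and (b) above could be sketched roughly as follows. This is
only an illustration under stated assumptions: the property values are
invented (they are not extracted from the real pages), the names
`similarity` and `cluster` are hypothetical, and a simple overlap score
with a fixed threshold stands in for a real probability measure.

```python
from itertools import combinations

# Invented properties per page number; NOT extracted from the real pages.
pages = {
    2: {"name": "Pat Hayes", "field": "AI", "employer": "UWF"},
    3: {"name": "Pat Hayes", "field": "real estate", "employer": "RE/MAX"},
    7: {"name": "Pat Hayes", "field": "AI", "employer": "UWF",
        "event": "FLAIRS-96"},
    10: {"name": "Pat Hayes", "field": "AI", "employer": "UWF"},
}

# Step (a): assign an identifier that is unique only inside this agent.
entities = {f"pat_hayes_{n}": props for n, props in pages.items()}


def similarity(a, b):
    """Share of (property, value) pairs two entities have in common.

    A crude stand-in for a probability measure (an assumption).
    """
    pairs_a, pairs_b = set(a.items()), set(b.items())
    return len(pairs_a & pairs_b) / len(pairs_a | pairs_b)


def cluster(entities, threshold=0.5):
    """Step (b): merge internal OIDs whose properties are similar enough."""
    parent = {oid: oid for oid in entities}

    def find(x):
        # Follow parent links to the representative of x's group.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for x, y in combinations(entities, 2):
        if similarity(entities[x], entities[y]) >= threshold:
            parent[find(x)] = find(y)

    groups = {}
    for oid in entities:
        groups.setdefault(find(oid), set()).add(oid)
    return list(groups.values())


groups = cluster(entities)
for group in groups:
    print(" = ".join(sorted(group)))
```

Note that the identifiers in the printed result are still only meaningful
inside this one program, which is exactly the publishing problem
described above.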

Disclaimer: I am not saying that the OID should point to anything
retrievable. All that is necessary is the ability to construct
a globally unique object identifier for a given entity.
URIs seem to be one way to do this; however, extensions are
certainly necessary.
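
As one possible illustration of an OID that is a URI but does not point
to anything retrievable, here is a small sketch using UUID URNs from
Python's standard `uuid` module. The choice of UUIDs is my assumption
for the example, not a proposal made in this thread, and the helper
names `mint_oid` and `mint_named_oid` are hypothetical.

```python
import uuid


def mint_oid():
    """Mint a fresh, random identifier as a URN (nothing to fetch)."""
    return uuid.uuid4().urn


def mint_named_oid(name):
    """Derive an identifier deterministically from an agreed-upon name.

    Two agents that start from the same name obtain the same OID
    without talking to each other.
    """
    return uuid.uuid5(uuid.NAMESPACE_URL, name).urn


random_oid = mint_oid()
shared_oid = mint_named_oid("http://www.coginst.uwf.edu/~phayes/")
print(random_oid)
print(shared_oid)
```

The second variant only helps if the parties already agree on the name
they start from, so it moves the agreement problem rather than solving it.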

Best,

         Stefan

>>But there is no control over
>>string literals, there is no agreement on what "my uncle's left thumb" is.
>
>Of course not. There is no agreement about what anything accessible 
>through a URI means, either. Any such agreement must, in the end, be a 
>social agreement within a community to use a name in a certain way - it 
>cannot be 'controlled' -  but what we can do is post our assumptions in a 
>publicly readable form in a publicly accessible place (which is indeed 
>where URI's are a natural tool to use.)
>
>>But if you give that a URI, you are asserting a particular context for
>>which that can be used. In other words, if I use that URI, you can define
>>it as meaning your uncle's left thumb
>
>No no, you can't have it both ways. HOW do I define it to mean my uncle's 
>left thumb? By putting something at the location indicated by the URI 
>which identifies my uncle's left thumb? But why don't I use that (whatever 
>it is) directly and save you the trouble of fetching a URI? You aren't 
>going to have learned anything new about what something means just by 
>forcing it to be sent to you by optical fiber, and neither is your 
>software. If I want to use a URI to specify a particular context (whatever 
>that means), then fine. But I see no reason why anyone should be *forced* 
>to do this.
>
>>, but only in the context of whatever
>>tools you are using that namespace with. I think there is some confusion
>>between machine processability and natural language here: there are
>>*limits* to what machine processing can do.
>
>Believe me, I am vividly aware of what those limits are. I've been working 
>in AI for 25 years. But machines can draw conclusions from axioms which do 
>not use URI's, for sure.
>
>>The whole point for using URIs
>>is that they are decentralized; anyone can set one up. While it is true
>>that you can't use URIs to represent everything, you can use them to name
>>anything which is namable.
>
>Well, that is either trivial or false. You can, of course, put a name at a 
>location with a URI and then use that. So in that sense URI's are 
>universal, but that's a trivial sense. In that sense every piece of paper 
>is in an envelope because you could put it in one.  But it is not true 
>that every possible *name* is a URI. My name is not a URI. "Boston" is not 
>a URI, and neither is "4,367".
>
>Pat Hayes
>
>---------------------------------------------------------------------
>IHMC                                    (850)434 8903   home
>40 South Alcaniz St.                    (850)202 4416   office
>Pensacola,  FL 32501                    (850)202 4440   fax
>phayes@ai.uwf.edu http://www.coginst.uwf.edu/~phayes
Received on Tuesday, 6 February 2001 12:40:25 UTC