RE: names, URIs and ontologies

Just a note.

http://mapping.usgs.gov/www/gnis/ is the site for the US Government's
official repository of domestic geographic names.

http://mapping.usgs.gov:8888/gnis/owa/GetDetail?tab=Y&id=617565 is the entry
for Boston, MA.

You can certainly talk about Boston in a general, non-specific sense, or
say "Boston" when you really mean the people of Boston or other things
related to it. Even so, I believe Boston and other geographic entities
are fairly well defined.

Pat

-----Original Message-----
From: pat hayes [mailto:phayes@ai.uwf.edu]
Sent: Friday, October 27, 2000 6:37 PM
To: www-rdf-logic@w3.org
Subject: names, URIs and ontologies


I would like to (re)-raise an issue which came up at the DAML 
kick-off meeting. Since it questions one of the basic assumptions 
made by the W3C folk, this is rather like farting in church, but here 
goes. It concerns names and URIs.

There is a basic supposition that all DAML names - nay, all names in 
*any* web logic - must be URIs. Even my logical friends have told me 
that this is harmless, since the idea of introducing a 'new' name - 
one guaranteed distinct from any other name in use - is a familiar 
one in conventional logic, and the use of URI# is a mechanism to 
ensure that any new name is globally unique. In this sense, then, 
URI's provide a foolproof way to prevent accidental clashes of 
logical constants in a global environment. OK, but what about names 
which are *not* unique? For example, suppose someone wants to put 
some information about Boston into DAML- say, that Boston Common has 
five sides. Never mind whether or how DAML can say things about 
numbers of sides; there is a more basic question: how can a web logic 
in which all names are URIs even *refer* to Boston? "Boston" is not 
a URI. Well, one answer would be that if one can find a reference to 
Boston on the web somewhere, say on http://www.boston.com/, then one 
could use something like http://www.boston.com/Names#Boston (if there 
were such a thing, and presumably there will be one day, etc.). The 
problem is that there are thousands of references to Boston on the 
web, and none of them have any particular claim to be the 'official' 
reference (and even if they did, there would be no way to enforce 
usage of that 'official' location for a name). So suppose some people 
refer to Boston using the above, but others refer to it using 
http://boston.citysearch.com/Boston/System/Index#boston; how is 
anyone to know they are referring to the same city? Another answer is 
that I can write "#Boston"; but this is even worse, since it merely 
introduces a new name which is guaranteed to be distinct from all 
other uses of "Boston". We are in the position of people who are 
trying to talk to each other, but every time one of them uses a name 
to refer to something, everyone else assumes that it means something 
entirely new. In effect, every use of a name introduces a new 
existential quantifier: instead of referring to Boston, I can only 
say: "something exists which I will now call 'Boston'...".
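
The behaviour of "#Boston" described above can be sketched in a few lines
of Python. This is only an illustration: it uses the standard library's
urllib.parse, takes its two base documents from the examples above, and
assumes that a strict URI-based logic compares names as whole strings.

```python
from urllib.parse import urljoin, urldefrag

# Base documents taken from the examples in the text above.
GLOBE = "http://www.boston.com/Names"
CITYSEARCH = "http://boston.citysearch.com/Boston/System/Index"

# A bare "#Boston" is resolved against whichever document it appears in,
# so the same local string yields a different URI on each page.
globe_name = urljoin(GLOBE, "#Boston")
citysearch_name = urljoin(CITYSEARCH, "#Boston")

# Assumption: names are identical only if the full URIs are equal strings.
same_name = globe_name == citysearch_name

# The fragment is shared, but it plays no part in that comparison.
same_fragment = (
    urldefrag(globe_name).fragment == urldefrag(citysearch_name).fragment
)

print(globe_name)
print(citysearch_name)
print("same name:", same_name, "| same fragment:", same_fragment)
```

Both pages wrote the same string, yet nothing in the resulting names says
that they refer to the same city.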

I can imagine all kinds of 'social' ways to deal with this. Maybe 
huge websites will be devoted entirely to establishing identities 
between various web-sites' usage of URIs, providing a kind of 
referential cross-checking service between URIs.   Maybe it will 
become routine for websites maintained by responsible folk to 
establish a reasonable number of identity cross-links to existing 
sites, and 'clusters' of mutually referring sites will enable 
identity of names to be routinely established in a reasonably small 
number of steps. Maybe some web communities will establish 'official' 
naming sites which become the standard reference for certain kinds of 
names, eg the post office for names of US communities, say. Maybe all 
of the above. But maybe also it would be possible to modify the 
current doctrine slightly, and allow a more natural use of names, 
more along the lines that they are used already throughout human 
society and probably have been ever since living creatures first 
evolved language.
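
One way to picture the 'clusters' of mutually referring sites is as a
union-find structure over published identity assertions. The sketch below
is hypothetical throughout: the SAME_AS list stands in for whatever
cross-links sites might publish, and the example.org URIs are invented for
illustration. Only the first two URIs come from the text above.

```python
# Hypothetical identity cross-links, as a cross-checking service might
# publish them. The example.org names are made up for this sketch.
SAME_AS = [
    ("http://www.boston.com/Names#Boston",
     "http://boston.citysearch.com/Boston/System/Index#boston"),
    ("http://boston.citysearch.com/Boston/System/Index#boston",
     "http://example.org/gazetteer#Boston-MA"),
]

parent = {}  # maps each name to its parent in the cluster tree


def find(name):
    """Return the representative of the cluster containing name."""
    parent.setdefault(name, name)
    while parent[name] != name:
        parent[name] = parent[parent[name]]  # path halving
        name = parent[name]
    return name


def union(a, b):
    """Merge the clusters containing a and b."""
    parent[find(a)] = find(b)


def same_referent(a, b):
    """True if some chain of cross-links connects a and b."""
    return find(a) == find(b)


for left, right in SAME_AS:
    union(left, right)

print(same_referent("http://www.boston.com/Names#Boston",
                    "http://example.org/gazetteer#Boston-MA"))
print(same_referent("http://www.boston.com/Names#Boston",
                    "http://example.org/gazetteer#Boston-Lincolnshire"))
```

Two names end up in the same cluster only if someone has asserted a chain
of links between them; a name nobody has linked stays on its own.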

Unlike logical constants, real names are *public*; they are used by a 
community to tell one another about things, by using the names to 
refer to the things. Logical 'names' - better called logical 
constants - are not names in this sense. They are like the 'names' 
which mathematicians use when they say things like "Let S be a 
set..."; they are temporary, used for the course of one proof and 
then abandoned for re-use, and in principle eliminable. In fact, many 
formal logics do not use constants at all. One can think of such a 
logical name as shorthand for an existential quantifier (something is 
known (or assumed) to exist, so let us call it "S" for now.) The 
logical restrictions on the use of constants (one must use a new name 
which has not appeared previously in the proof) are designed to 
protect the reasoner from accidentally conflating two of these 
existential assumptions and invalidly transferring properties from 
one to the other. All of this however is a very 'local' matter, and 
the real use of names does not come up in this world. I might use the 
name 'S' to mean one thing today and something else tomorrow, and you 
might use it at the same time as me. Web logic has to be different: 
we cannot prevent, and I do not think we should try to prevent, the 
public use of names. As long as people are using content scrapers on 
open text and trying to represent the content they find there, for 
example, the use of names in the public sense is inevitable, since 
this is how names are used in language. They are not logical 
constants!

The uniqueness of the URI constraint means, in effect, that all names 
on all web pages must be treated by DAML as though all public content 
had been eliminated. In logical terms, it 'reads' every identifier as 
being existentially quantified by a quantifier *on that very page*. 
So when the Boston Globe uses the name "Boston" and CitySearch uses 
the same name, the identity of these names is to be ignored, and DAML 
in effect treats these two pages as each making a separate 
existential claim, which they happen to have Skolemised using the 
same logical constant name. But DAML treats this as an accident: they 
might as well have used different strings, since DAML treats any 
name on one page as distinct from a name on another page. If all 
names are attached to URI's, then it is impossible for the logic to 
even express a public name: there is no social assumption about name 
usage. This is surely a terrible error. It is worse than the tower of 
Babel: it assumes that we all mean different things every time we use 
a word. Why not allow public names, and let our reasoners take the 
same risks that we all take every time we speak?

Suppose that DAML allowed, but did not require, that names be URIs. 
Then I could refer to Boston and you could refer to Boston, and we 
would be using the same name; and indeed it would be the same name 
that is used already on the webpages already referred to (which 
contain not a shred of DAML but lots of handy information about 
Boston.)  Of course there are potential dangers.  It might be that 
your use of "Boston" and mine might be only a lexicographic accident, 
and I was referring to the American city while you were referring to 
the small town in south-east England from which the name of my city 
was originally derived. If so, we might initially misunderstand each 
other; and this kind of misunderstanding is always a possibility 
which a reasoner must be prepared for when using 'public' names like 
this.  On the other hand, the fact that in the ordinary course of 
human affairs we find it much more convenient to use names 
unprotected by a kind of 'local use' label (I don't usually say 
something like "Boston in the sense that Pat uses it", I just say 
"Boston") suggests that in practice, the advantages might outweigh 
the disadvantages. And the fact that such public names can be clearly 
distinguished from URI's means that this modification need take 
nothing away from the security and confidence in naming which the use 
of URI's may provide; one would use public names at one's own risk, 
and any use of a public name in a proof could be detected immediately 
and the conclusion marked as potentially suspicious, if one were 
anxious about any such misunderstandings. (Indeed, any public name 
could be referred to using the URI for the place it occurs, so that 
one might conduct in 'strict-naming' DAML a discussion whose purpose 
is to resolve an ambiguity arising from public-name use.) 
Nevertheless I bet that they would get used, at least by the great 
multitudes of web citizens, even if less so by the somewhat paranoid 
B2B and defense communities. (Even in law, public names often have 
considerable force, eg real-estate deeds often refer to 'the dwelling 
known as ....') Public web names would probably evolve their own kind 
of social use. For example, clusters of mutual reference could be 
used to establish agreement between usages (my sense of "Boston" is 
the same as these:...) but these could include for example reference 
to uses in non-DAML sources, including plain text. The result is a 
messier (for the logician) but socially richer and more robust kind 
of linkage of names to their use, and one that is also likely to 
provide a more useful bridge between the 'ordinary' use of the web 
and the 'semantic web' of the near future. And speaking as a 
logician, I find the new messy complexities more interesting than 
disturbing. For example, one way to understand the use of public 
names in logical terms is that such use amounts to a kind of 
agreement to a shared existential quantifier whose scope extends 
beyond one local ontology to encompass an entire community of 
ontologies. The resulting picture of logical structure transcending 
the usual lexical boundaries might provide the beginnings of a new 
way to conceptualise the 'logic of the web'. But until we allow 
people to use names in a less restrictive mode, we cannot even get 
started on this new enterprise: we have forced DAML names to be 
mathematical identifiers rather than real names.
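
The suggestion that a public name in a proof "could be detected
immediately and the conclusion marked as potentially suspicious" might
look roughly like this. It is a sketch under stated assumptions, not a
description of anything in DAML: the Conclusion class, the is_public test,
and the #BostonCommon URI are all hypothetical, and "has no URI scheme" is
just one crude way to tell a public name from a URI.

```python
from dataclasses import dataclass
from urllib.parse import urlparse


def is_public(name: str) -> bool:
    """Assumption: a name with no URI scheme counts as a public name."""
    return urlparse(name).scheme == ""


@dataclass(frozen=True)
class Conclusion:
    """A derived statement plus every name its derivation relied on."""
    statement: str
    names_used: frozenset

    @property
    def public_names(self) -> frozenset:
        return frozenset(n for n in self.names_used if is_public(n))

    @property
    def suspicious(self) -> bool:
        # Any reliance on a public name marks the conclusion for review.
        return bool(self.public_names)


# Hypothetical URI for Boston Common, invented for this sketch.
COMMON = "http://www.boston.com/Names#BostonCommon"

strict = Conclusion(
    "Boston Common has five sides",
    frozenset({COMMON}),
)
loose = Conclusion(
    "Boston Common is in Boston",
    frozenset({COMMON, "Boston"}),
)

for c in (strict, loose):
    print(c.statement, "| suspicious:", c.suspicious, sorted(c.public_names))
```

A reasoner that is anxious about misunderstandings can discard or
double-check the flagged conclusions; one that is not can simply use them.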

Allowing public names seems to me to be a no-brainer.  They will be 
extremely useful and they will not disturb or interfere with anyone 
who doesn't want to use them. They will provide a way for genuine 
social uses of 'web logic' to evolve naturally. I can see no cost to 
allowing them, other than that they violate some kind of doctrine. 
Arbitrary doctrines which make my life more complicated and awkward 
than it needs to be are a prime target for being questioned and, if 
necessary, disobeyed.

Anyway, I'd be interested in any comments.

Pat Hayes
---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes

Received on Monday, 30 October 2000 18:10:44 UTC