- From: pat hayes <phayes@ai.uwf.edu>
- Date: Thu, 13 Mar 2003 17:16:02 -0600
- To: Tim Berners-Lee <timbl@w3.org>
- Cc: Public W3C <www-archive@w3.org>
>On Tuesday, Mar 4, 2003, at 12:40 US/Eastern, pat hayes wrote:
>[...]
>
>>>>
>>>>?What is a definition?
>>>
>>>A definition is a text which describes the meaning of a term..
>>
>>Describes to who or to what? You have to say who (or what) is
>>expected to be able to read the text and extract the meaning. All
>>this debate turns on the issue of having 'texts' readable by
>>software, and what the limits of this are defined to be. Software
>>can't read English text, but it can read and use RDF/S/OWL text.
>>
>
>No, this debate does *NOT* turn on the software reading the specs.
If specs count as definitions, then it does. But who mentioned
specs? I asked you to define 'definition', and you started talking
about texts, and I asked you who or what is reading the text, and you
refer to specs. I feel like the ground is shifting as we talk.
The semantic web depends - MUST depend - on having *meanings*
processed by software. I think this is the basis of the mismatch in
communication we are having over this issue. Read on for details.
>To situate RDF and OWL within the real world, you rely on particular
>vocabularies
>which are processed by software which has been written by a human being who
>has read the spec.
True. But notice, the writer of the software has read the spec of the
formalism - in our case, RDF, say - NOT the spec, if there is one,
which defines the meaning of the particular vocabulary. He can't
possibly have read that, in general, because the software he writes
has to be able to handle vocabularies written to conform to
specifications of meaning which have not yet been written or even
thought of when he is writing the software. The RDF spec does not
define the meaning of every RDF vocabulary. It would be useless if it
did. Contrast this with, say, a Java spec, which does in a sense
define the meaning of every Java program, at least to the extent that
a Java interpreter written to the spec is capable of running any Java
program that will ever get written.
You keep repeating this, that software is written by people who read
specs. It is of course true, but it is beside the point here. The
SWeb puts us in a position (familiar to AI programmers) where we need
to write software which ITSELF will be processing formal
representations of meaning. When writing CWM, you weren't able to
access the specs of the URIs in all the RDF that CWM might ever
process. At that point, the spec which determines the meaning, and
the programmer who writes the code, have an impassable barrier
between them. The software has to operate on its own.
> Open Financial Exchange bank statements are generate inside
>a bank by a programmer who has read the OFX spec and understands what the
>fields mean. On the consumer's side, on your desktop, the document is read by
>software (such as Quicken) which was written by people who read or
>even wrote the
>OFX spec. (OFX is actually a SGML application but it could be an
>RDF application
>and the RDF mapping is straightforward) If you as a use RDF rules to
>make a USA income tax return from that data, then you generate info in
>in a vocabulary of IRS Form1040 line numbers.
Maybe so, but the example seems to illustrate my point rather than
yours: the RDF spec doesnt say anything about the meaning of this
vocabulary. And there is nothing in the spec of RDF which says, in
fact, that an RDF vocabulary need even have a spec, or be defined
anywhere by anyone. Its just a vocabulary which someone is using to
say something about something. RDF can be used by anyone to say
anything about anything, right? There are ontologies out there
already which talk about all kinds of things, eg take a look at the
emerging DAML-time ontology (not yet put into DAML). There's lots of
English in the document, but nobody is going to claim that the
meaning of the English is part of the meaning *of the axioms*.
>The meaning of these fields are defined in a human-readable document which is
>online.
In your example; but not in general. And even if it is, it may have
no obvious connection to the actual base URI. For example, take RDF.
If you follow the base URI of any piece of the rdf: namespace, eg
rdf:type, you get to a page full of, guess what, RDF. You don't get
to the RDF spec, which is another document altogether. Similarly for
DAML, by the way.
> The rules which you might use to classify your spending and income
>maybe written in OWL and the data may be processed by an inference machine.
>
>Specification-wise, the inference engine is only authorized to make those
>inferences because the OWL predicates, when looked up, point to the
>OWL spec. The inference engine was not written by a computer program,
>it was written by a person who read the english OWL spec.
Of course, but this doesn't say very much. The point at issue here is
saying what the meaning of a particular OWL vocabulary is. Most OWL
vocabularies won't have any particular link to any spec at all,
machine-readable or otherwise.
>>>
>>>For example, http://www.w3.org/TR/1999/REC-rdf-syntax-19990222/#property
>>>is (part of) an english definition of a rdf:Property.
>>>
>>>I know its a bit ofashock, but we are not being formal here.
>>
>>I know we aren't being formal, but we do need to be more precise.
>>One can be precise using English, believe it or not.
>>
>>>This definitoin
>>>of property is a rough and ready english definition. But we get by with it
>>>and the rest of the stuff about predicates in the document.
>>
>>Believe me, there are a lot of people in the various WGs who do NOT
>>get by with this rough and ready a definition. Hence, I suspect,
>>many of the communication problems we are having here.
>
>Well, the english definition of the property is what we have. It
>includes reference to some
>mathematical stuff in the spec,but i'd note that the model theory is
>invoked by the
>english, when you look at how this spec gets its authority, not the
>other way around.
That states a position in what is basically a debate within the
philosophy of language. Myself, I don't agree with it; very few
working mathematicians would, I suspect, and very few programmers
would agree that the meaning of a programming language was rooted in
the English of the manual rather than, say, the code of the
interpreter. But in any case, its by no means obvious or even
plausible as an account of where 'authority' arises here. Many
philosophers would say that English meaning is rooted in a kind of
social compact between cognitive agents, and many cognitive
scientists (though not all) would go on to base that compact, in
turn, in a presumed internal representation inside those agent's
heads, and some (though not all) would say that those internal
representations' relationships to the world they describe could be
understood using model-theoretic terms. You can go around and around
this circle, and its not clear where, if anywhere, you bottom out in
an 'authority' or a 'grounding'. The truth is probably more that
meaning arises as a complex system phenomenon, and different
theorists can find it's 'origin' almost anywhere in the system, which
tends to be where their particular discipline stops looking for
explanations. I know some extremely clever folk who are sure that
meaning is rooted, ultimately, in how the equations of quantum
gravity affect the chemistry of microtubules inside neurons.
By the way, your dismissive phrase 'mathematical stuff' isn't likely
to endear your way of thinking to the technicians who are actually
getting your semantic web built. You really ought to use a nicer tone
when talking about the workmen, particularly when they might be
listening. And it misses the essential point that 'mathematical
stuff' can be about non-mathematical things. That is usually why
mathematics is so useful, in fact: its just a very precise language
for talking about whatever you are talking about, its not a geekish
jargon for saying nothing in particular. God alone knows, this entire
discussion could do with a little precision.
>>>>
>>>>>of the predicate, as applying to the subject and object identified by
>>>>
>>>>
>>>>?How do the subject and object identify things?
>>>
>>>Um.. by using a URI, where sender and receiver share
>>>information
>>
>>And if the sender and receiver are software, what does it mean to
>>say that they share information?
>
>For formal stuff, there is a mapping from ?x to ?x.log:semantics
>which is common (modulo real world
>effects such as power outages, lying and cheating, etc) to both users.
I do not follow your meaning here. What are ?x and ?x.log:semantics?
And what has this got to do with the question I asked?
>
>For most software today, such as Quicken or Microsoft Money or Apple
>iCal, the spec
>has been read by a programmer who made the code.
You keep saying things like this, and I keep thinking that it is
entirely irrelevant. We seem to not be communicating at some very
basic level.
Look, code is written by people, lets agree. And people who write the
code have read and understood the specs which define the programming
languages they are writing the code in. OK, all agreed. Now, this all
has absolutely nothing to do with what we are talking about. RDF and
OWL are not programming languages, and understanding the semantics of
the implementation language of an OWL inference engine does not
contribute one jot or tittle to saying anything about the meaning of
the OWL being processed by that engine. Even a spec which says
something about the assumed meaning of something that might appear in
an OWL vocabulary and get input to an OWL engine cannot have any
relevance to what that engine does, since there is no requirement of
the writer of the OWL engine to have read every spec of every
information source of any vocabulary that the engine might ever have
to deal with.
>Actually, this applies to OWL inference engines. They can be
>considered, from the web point of view,
>just RDF systems which have been programmed to have an inherent
>knowledge of the
>meaning of certain terms. This allows them to do certain things on my behalf.
Yes, of course. I presume by certain terms you mean those in the OWL
namespace. But that is beside the point in this discussion. The
question is not what the OWL:... URIrefs mean: we have that part
reasonably well locked down. Its what all the OTHER URIrefs in an OWL
document mean, the URIrefs in the subject position of the content
triples, all those facts about whatever that are being
OWL-manipulated. Neither the OWL programmer not the OWL inference
engine is in a position to be able to follow those links and read and
understand what it finds there. The programmer can't see those links,
since he or she is long gone when the stuff is actually running, and
the OWL engine can't read English.
>>> which restricts
>>
>>restricts how??
>>
>>>the assication of the URI to one thing
>>>(or one thing withiin a given shared context).
>>>I am not sure what level of answer you are looking for here.
>>
>>Well, even at the rough and ready level, it would be good to have
>>some general guidelines which say how to use URIrefs to refer to
>>things. Seems to me there aren't any right now (except for URNs).
>>If a URIref is a URL then I can use it to locate a web page, or a
>>document if you like. Now, how do I use that document to locate the
>>referent? Are there any rules or guidelines about that, of any kind?
>
>Well, my Stack article outlines them.
Pointer?
> They are in fact defined by a rather large pile of specs.
No, they CANT POSSIBLY be determined that way. Think about it:
suppose you had to explain to a Martian how English personal names
worked. You would have to talk about the *processes by which things
get named*: baptism, christening, registration of births, etc..,
legal conventions regarding name uses, stuff like that. Giving a
spec of what a name 'is' wouldn't do the job. {Later: see below for
qualifying comment.]
>RDF's job is only to be a relatively simple link in the chain.
>
>They haven't been written up formally,
I havn't seen them written up in any way at all, formally or
otherwise. Ive read phrases like 'URIs are universal identifiers'
many times, but that's just froth: that doesn't tell me HOW things
get baptised with URIrefs.
> but in the special case of an RDF document,
>the RDF spec should tell you what an RDF document tells you about
>the referent.
What is doesn't tell you, and what no MT or language spec alone can
possibly tell you, is how to determine what the referent IS. So why
is what it says relevant, if you have no idea what it is saying it
about?
[Later: after reading your piece on HTTP URIs, that provides a nice
illustration. If we say that these URIs are names for the documents
you get by using the HTTP protocol on them, then this would be a very
good 'extra-logical' way of saying what one class of names means and
how to attach a name to a certain kind of thing. And I guess one
could say that this is rooted in a pile of specs, but they are specs
about how to actually USE these names to get at something. They are
all about transfer protocols. Now, you can't transfer a car or a
galaxy by using a protocol; so what kind of specs are going to
provide ways of naming solid, non-informational things like that?]
>>>Statements which restrct interpretations such that
>>>within the domain of discourse, for any intepretation,
>>>any things identified by the URI are equivalent?
>>
>>Well, that assumes that this is somehow done within a model theory,
>>which would be nice, but has its limitations. There are some
>>general results limiting the extent to which MTs can possibly
>>restrict or impose referents: eg the Herbrand results show that any
>>consistent set of statements CAN be interpreted so that the names
>>all refer to themselves, so one can make an interpretation entirely
>>out of symbols. For reasons like this, one usually expects that a
>>theory of reference - of naming - requires something additional and
>>external to the MT in order to 'ground' names in the actual world.
>
>Well, the MT, while central for you, is in fact a rather peripheral
>(though indeed useful) part
>of all this.
I don't think Im being merely turf-defending if I beg to differ.
Formal semantics aren't just a formal aside: they represent about 60
years of intensive investigation into the basic semantics of
languages. Not just 'formal' languages, but ALL languages, ranging
now as far as things like diagrammatic notations, cartographic
conventions, programming languages, natural languages and all the
notations of logic and mathematics (which includes all of science and
engineering). They are about as central as you can get. Without a
precise semantic theory you don't have squat, you are just making
noises. Once you start specifying formalisms intended to express
propositions, an MT (or some kind of formal semantic specification)
becomes absolutely central, because it doesn't just tell you what is
represented: it also tells you what is NOT represented. If you don't
want it to do that, you really shouldn't have let the WG's publish an
MT at all.
You seem to want it both ways: a nice consistent web of interacting
software, but also a nice blurry Humanist sense of open-ended
Meaning. Until we can make human-level AI engines, you have to choose
one or the other.
>The RDF spec should tell me to go look in the spec of the predicate
>to find out what p owl:inverse q means.
It shouldn't do that for several reasons. First, most predicates will
not have a spec. Second, *your* ability to read the spec, even if it
exists, is irrelevant if you are not around when the RDF is being
processed, and I presume the whole point of the SWeb is that you will
not be around, but that software will be doing it for you. And third,
even if you are around, and there actually is a spec for you to read,
what relevance or utility does that have to someone who is writing an
RDF processor? The RDF engine isn't going to even know what *you*
think about what the spec says.
>I go and look up the owl:inverse, get eth OWL spec.
Actually you don't, see above. Not that it matters.
>It is a bunch of english and RDF which refer to each other, and the
>english bits
>refer to the MT. A programmer can then write code
But that is beside the point. Look, the whole idea here is to write
RDF-handling code, right? I am talking about what that code is
supposed to do. Thats what the RDF spec ought to do, to provide a
guide for writers of that software. If I hear what you are telling
me, the software ought to follow the subject links and read the specs
it finds there. Since this is written in English, the poor dumb
software presumably has to ask some human to read it, and wait for
some input to decide what to do. This is bad enough, but now you are
telling me that the RDF-handling code has to wait for those humans to
write some other code? How fast is this whole thing supposed to run,
for God's sake?
>which will output
>a p b given b q a
>because he understood the spec with the help of the model theory.
No, the job of the RDF programmer is to read the model theory and the
rest of the spec, write a reasoning engine which performs valid RDF
inferences (let us suppose) according to the MT, then to go away and
let *that code that he has written* alone to process RDF at
electronic speeds. God alone knows what the RDF it processes is going
to be about or what the URIrefs in it refer to, but whatever it is
about, he - the writer of the code of the RDF engine - has no way to
read any of *that*. When the URIrefs in some RDF arrive down the
optic fiber and get processed by the RDF engine, the programmer is
OUT OF THE PICTURE. The RDF engine is alone with the RDF. Now, who is
going to read those property URI specs, again?
>That's the state of the art. When we can pick up (in the RDF bit)
>axioms for the new terms,
>then the state of the art will advance a bit, and some of the
>functionality of the OWl
>terms will be directly loadable by a more general reasoner.
>
>>
>>>>
>>>>Neither of these are easy questions to answer and neither of them
>>>>has an answer in the current spec.
>>>
>>>No, that's good, because the questoin of what is identified by a URI
>>>is dealt with in URI spec and associated specs.
>>
>>Not in any Ive read.
>
>Clearly not to your liking, and not really to mine, or we wouldn't have
>a bunch of TAG issues on the subject. But the URI specs are the documents
>which define that, in as much (a) you can understand what they mean
>and (b) your philosophy allows for specifications at all of anything.
I have read them as English, with an open mind, honestly trying to
get something useful out of them, several times. I havn't set out to
treat them like a philosophy essay, or to critique them. But they
just don't work: they don't say what needs to be said, they
contradict themselves (in the only way I can possibly understand the
words in them) and they don't specify their meanings adequately. I
*still* , after several years immersion in W3C business, have
absolutely no idea what anyone means by "resource". I strongly
suspect that everyone means something different.
>>>The only question that RDF has to answer (not as part of
>>>itself, but as part of a duty delegated from the URI spec) is to
>>>show how, when the URI
>>>is an identifier within an rdf document (a la foo.rdf#bar), to
>>>show how RDF allows
>>>the set of things which a URI might by identified to be restricted by
>>>RDF statements about that thing, or as we say in english, how RDF
>>>documents can describe things.
>>
>>OK, that is fine with me. But that way of saying things treats URIs
>>as logical constants (ie things that denote whatever the logical
>>constraints force them to denote ) rather than names (ie things
>>that *have* a denotation to which they just *do* refer, attached to
>>them by some extra-logical means, like "Patrick John Hayes" refers
>>to me because it says so on my birth certificate.)
>
>Whether something is a constant or a name then depends on whether
>the identification is "extra-logical", which
>depends on the boundary you throw around some stuff you call your logic.
That makes it sound like an empty terminological issue, but that gets
it backward. Its extra-logical because it deals with issues that
logic isn't designed to handle. I cast logic as wide as it can go,
but I know that it has its limits. Specifying referents is beyond the
competence of a formal logic. Formal logics were designed to analyze
some aspects of human language, essentially, and they have been very
successful at that analysis. But language has other aspects, and
reference and naming is just elsewhere in the intellectual landscape.
(This isnt an original thought, by the way.)
>For the OWL MT by itself they are just names, while for the whole
>semantic web they would
>be more like constants?
More like the other way round. There is more to a 'proper' name than
its logical role as a constant. You can treat it as a logical
constant in the logic, and that is fine: but that fails to capture
something essential about it. As far as the logic is concerned, for
example, you can interpret any constant as denoting itself. But a
proper name really CAN'T be interpreted that way: it comes with a
fixed referent. "Patrick John Hayes" really is *not* the name of a
character string.
>>The thing is, Im sure that most actual uses of URIrefs are more
>>like names than like logical constants, in fact; but we don't have
>>any rules for specifying how these names get their referents
>>attached to them. (For example, does a URL *denote* the web page
>>you get by using the http protocol on the URL? Some people assume
>>it does, others make assumptions which are incompatible with that,
>>eg Euler. There are no rules for this in the URI documents, which
>>aren't worded with enough precision to even make the relevant
>>distinctions.)
>
>Voila. Hence the TAG issue. My own view is clear.
>http://www.w3.org/Designissues/HTTP-URI
I tend to agree with you about HTTP URIs. It makes sense to assume
that the denotation of an HTTP URI is the document-thingie you get
when you use the HTTP protocols on it. I would be quite happy if the
RDF MT were to mandate this, and then indeed we would have at least
one class of genuine names. (We would need to get down to a bit more
detail about individuation criteria for those thingies, so that OWL
could use cardinality reasoning on them, but that would be fun.) But
we still need a way to give names to things that aren't documents,
and most interesting things aren't.
>>In normal human society there are all kinds of such rules for
>>attaching referents to different kinds of names (baptism,
>>registration of a birth, marriage, ship naming ceremonies,
>>whatever) and associated knowledge about how to recognize a proper
>>name and how to access its referent, if you really want to get at
>>it.
>>
>
>Indeed. It is simpler on the web. It may be the first time one has been
>able to formalize the process.
Maybe, indeed. Interesting, and exciting! But I'd like to get into
the details a bit more. I want (and I think we will very soon, if not
already, NEED) a way to allow software to do the actual naming, or at
least to be able to be told about it. We need protocols for actually
doing Web baptism, or at least for adding new names to things that
already have names (which can be done entirely using the names: "I
hereby name the thing named "Patrick John Hayes" by the new name
"Arthur-26". " Or maybe "The name "mydatatype:octal" names the
rdf:Datatype which you can retrieve using this datatype API:...., )
BTW, this issue has already come up in the RDF spec, concerning
datatypes. How does one baptise a new datatype, so that an RDF engine
can be told about it? You 'give' it a URIref... but *HOW*?
>>>>>If my:car :color :blue means that my car is colored blue, that
>>>>>is what it means, quite independent of context.
>>>>>The concept of something having a given color is
>>>>>defined (and only defined) by the definition of color
>>>>
>>>>
>>>>Bad example, as color terms don't have definitions.
>>>
>>>They do. Casesium red is the sprectrum of an excited Caesium atom.
>>>Some are vauge -- red is a color which has predominatlylonger wavelengths.
>>
>>OK, true. But there is a sense of 'red' which can only be accessed
>>by people who have color vision. Philosophy rears its ugly head....
>>OK, leave aside philosophy (that's one problem with using words
>>like 'meaning') and we agree, it's fine to leave the definition of
>>'red' to, say, Pantone. The issue for us here is what it means to
>>say that Pantone 'defines' the 'meaning' of pantone:red35 , what
>>RDF(S/OWL) needs to know about that kind of definition, and whether
>>the RDF spec needs to say anything about that.
>
>The RDF spec only has to hand off authority to the pantone spec to
>define the actual relation
>identified by pantone:color.
There's a step missing. OK, Pantone has the authority to *define* the
name, say, pantone:red35. But something needs to be able to tell RDF
that this *is* a name, and not just an arbitrary URIref. And ideally,
RDF should have as part of its spec that when given a name, an RDF
interpretation is required to interpret that name as denoting the
thing it names (according the naming authority). That is in fact
critical to the intended use of RDF, as this kind of example shows,
but we can't say it at present in the RDF spec, because there isn't
any global notion of what a name IS. Its easy to say (using some of
that mathematical stuff that you disdain) but to do so requires one
to refer to the name of a thing. And right now, there is no general
way to refer to the Web name of a thing.
Heres another way to say it. I want to make that notion of
'authority' that you used, rigorous enough so that I can write model
theory equations about it, not just say things about it vaguely. I
don't think it is *conceptually* hard to do this, but it requires
something to be done at a very global level of organization for the
Web. Its not just an RDF or an OWL issue, its a whole-Web issue.
>>>>>and my:car only serves to idetify the car
>>>>
>>>>
>>>>How does a uriref identify a car? (Genuine question, not rhetorical :)
>>>
>>>Notionally, the URIref identifies the car so long as everyone who uses
>>>it does soconsistently with it identifying the car.
>>
>>But on the SW (for the first time?) 'everyone' has to be understood
>>as including software agents, and so this begs the question, since
>>we are left trying to figure out is how THEY can be said to
>>'identify' a car when given a URIref. I think the best we can do
>>here is to say that they can't actually do the identifying, but at
>>least they can be required to pass information around in a way that
>>doesn't screw up anyone else's identifications. Then allowing the
>>sofbots into the social fabric doesnt add anything really new in
>>the way of reference, but at least it causes no actual harm.
>>
>
>Yes, we're not looking to a bot (in general) for an inherent
>understanding of redness.
But we ought to be ready for when they DO start to have one. Or at
least when they start to contribute to that understanding in new
ways, eg in the case of red, by having color-meters more sensitive
than the human fovea linked to software agents which are making
decisions about paint purchases, things like that.
>(In fact we can build software which will tell you whether a picture
>of something is red.
>But we don't expect an OWL engine for example to be able to figure things out.
I bet the time wont be far off when people expect to be able to have
a reasoner figure out things which involve knowing that a color is
the kind of property which can be handled by a certain kind of
color-checking agent. Or that a car is the kind of thing that can be
identified by checking its chassis number (chassis numbers can be
used as names for cars).
>You could actually write a level-breaking rule which told you
>whether a one-pixel
>GIF was red by allowing log:contents and some string matching expressions.
Yuk.
>But that is very much a corner case!)
>
>>>Specifically and practically, the semantic web protocol is that
>>>a web page in RDF foo.rdf has a description of something
>>>of type Car and typically giving a country code and plate number
>>>as property values.
>>
>>Right, that kind of answer makes sense. Well, in one reading it
>>does, ie if 'description' means RDF/S/OWL/... description. Then the
>>formalism links the URIref to some other URIrefs which have
>>socially recognized status as 'grounded' in real things like cars.
>>That way of thinking is quite compatible with the formal MT for the
>>web logics, and neatly separates the MT-handleable issues from the
>>much scruffier notion of grounding.
>>
>>I think many of our communication problems over this issue come
>>from confusing that reading with the reading in which 'has a
>>description of' means a description in some language (eg English)
>>which a softbot cannot hope to do any reasoning with.
>
>We will work in a world in which robots work with a subset of the knowledge,
>but are better at handling that subset than we are.
Right. And the first thing they will be good at is doing stuff that
we can do, in fact, but much faster and larger-scale than we could do
it, and without getting bored out of their tiny skulls.
Pat
--
---------------------------------------------------------------------
IHMC (850)434 8903 or (650)494 3973 home
40 South Alcaniz St. (850)202 4416 office
Pensacola (850)202 4440 fax
FL 32501 (850)291 0667 cell
phayes@ai.uwf.edu http://www.coginst.uwf.edu/~phayes
s.pam@ai.uwf.edu for spam
Received on Thursday, 13 March 2003 18:16:07 UTC