Re: Grounding in English (was Re: semantics status of RDF(S)) from pat hayes on 2001-04-06 (www-rdf-logic@w3.org from April 2001)

From: pat hayes <phayes@ai.uwf.edu>
Date: Fri, 6 Apr 2001 13:32:50 -0500
To: Sandro Hawke <sandro@w3.org>
Cc: www-rdf-logic@w3.org
Message-Id: <v0421010bb6f3b8c43a57@[205.160.76.201]>
> > >I'm going to go out on a limb here and propose a solution that I think
> > >provides solid semantics without being unduly restrictive.  It's
> > >simple: reduce the RDF model to binary relations stated with
> > >locally-scoped terms which may be defined directly in English (not
> > >indirectly as URIs attached to semantics by various standards bodies
> > >and by application developers).
> >
> > Not sure I follow this, but English is not a good way to state
> > semantic meanings!
>
>Sorry for not being more clear.
>
>It seems to me that the only way two agents (eg you and I, or two
>computer processes) can communicate (at least electronically) is by
>exchanging linguistic expressions in a language they (we) both know.
>
>We can, of course, define a language (KIF, Prolog, FOPC, RDF, etc might
>qualify) and then use it.  But we have to define THAT language using a
>language we already know.

No, we don't!! This is a KEY POINT, and is the basic reason why we 
'logics types' keep going on and on and on about semantics.  Giving a 
translation into another language is NOT the only way to fix meaning. 
A much, much better way is to give rules for attaching the language 
to a world. That is what model theory (= logical semantics) does. 
Model theory does not translate a language into set theory: it 
describes (using set theory) what a 'world' has to be like and how 
the expressions of the language have to be interpreted in such a 
world, and what their truthvalues are when they are interpreted. The 
worlds may seem a bit wimpy (eg for DAML+OIL, a 'world' is just some 
set of things plus some subsets of it called classes and some 
relations on it called properties) but that's the point: the more 
'minimal' the assumptions made about the 'world', the more widely 
applicable the language is. Once that is done (and it's not usually 
all that hard to do, though working out all its consequences takes 
some time), all the translation stops right there: you can determine 
whether or not something is a correct inference, for example, by 
checking that it preserves truth in any 'world'; you can tell what 
something says by checking out what kind of 'world' it is always true 
in, and so on.
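
The truth-preservation check described above can be sketched in a few lines of Python. This is only an illustration under loud assumptions: the statement forms ("type", "subClassOf"), the function names, and the names Dog, Animal and rex are invented for the example, and the search covers only the interpretations over one small finite domain, so it illustrates the idea rather than deciding validity in general.

```python
from itertools import combinations, product

def subsets(items):
    """All subsets of a finite collection, as frozensets."""
    items = list(items)
    return [frozenset(c) for r in range(len(items) + 1)
            for c in combinations(items, r)]

def worlds(domain, class_names, individual_names):
    """Enumerate every interpretation over a fixed finite domain:
    each class name gets a subset, each individual name gets an element."""
    for exts in product(subsets(domain), repeat=len(class_names)):
        for denots in product(domain, repeat=len(individual_names)):
            yield {
                "classes": dict(zip(class_names, exts)),
                "individuals": dict(zip(individual_names, denots)),
            }

def holds(statement, world):
    """Truth value of one statement in one world (hypothetical mini-language)."""
    kind, a, b = statement
    if kind == "type":          # individual a is a member of class b
        return world["individuals"][a] in world["classes"][b]
    if kind == "subClassOf":    # class a is a subset of class b
        return world["classes"][a] <= world["classes"][b]
    raise ValueError(f"unknown statement kind: {kind}")

def entails(premises, conclusion, domain, class_names, individual_names):
    """True if the conclusion holds in every enumerated world
    in which all the premises hold."""
    return all(
        holds(conclusion, w)
        for w in worlds(domain, class_names, individual_names)
        if all(holds(p, w) for p in premises)
    )

domain = [0, 1]
classes = ["Dog", "Animal"]
individuals = ["rex"]

# Truth-preserving: every world satisfying the premises satisfies the conclusion.
valid = entails(
    [("subClassOf", "Dog", "Animal"), ("type", "rex", "Dog")],
    ("type", "rex", "Animal"),
    domain, classes, individuals,
)

# Not truth-preserving: some world makes the premises true and the conclusion false.
invalid = entails(
    [("subClassOf", "Dog", "Animal"), ("type", "rex", "Animal")],
    ("type", "rex", "Dog"),
    domain, classes, individuals,
)
```

Nothing in the sketch translates the statements into another language; it only says what a 'world' is and when a statement is true in one.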

>And it occurs to me that we end up back at English.  Usually English
>in research papers and textbooks and formal specifications and
>dictionaries, English written with an eye toward semantic precision,
>but still just English.

Personally, I NEVER end up at English, when specifying meanings. If 
any English is used, I know that the meaning is NOT properly 
specified. The use of natural language in a semantic specification is 
a hallmark of the job not being done properly. (There are whole 
libraries of stuff on this point in the AI literature, by the way.) 
So I couldn't disagree with you more.

>This gives us an interesting design option: we can make a knowledge
>exchange language with extremely simple syntax and extremely simple
>semantics that can be arbitrarily extended in a non-conflicting way by
>any user community.

We already have this, and it is called logic (or some subset thereof).

> > >More formally, in prolog syntax, the RDF model would be defined as
> > >having two relations:
> > >
> > >   binary_relation(subject_term, relation_term, value_term).
> > >
> > >which means that the relationship identified by the relation_term (by
> > >the mechanism defined below) is truly held between some object
> > >identified by the subject_term and some object identified by the
> > >value_term, respectively.
> > >
> > >The definitional grounding for the relation_term and optionally for
> > >the other terms would be provided by:
> > >
> > >   english_definition(term, "This is English text which defines something").
> > >
> > >This approach allows semantics to be defined with arbitrary precision
> > >for humans by allowing the inclusion of entire textbooks or legal
> > >codes if necessary.
> >
> > But it doesn't provide any model-theoretic semantics, since English
> > doesn't have a model theory. The point is not to pin down the meaning
> > for English readers, but to provide a mathematically checkable notion
> > of valid inference for machines.
>
>So you and Peter can define a language (which is just a vocabulary,
>here) using my mechanism.  And then you can define another one.  And
>the two can coexist and cofunction in a variety of data stores.  And
>machines which have been taught one of the languages (ie which know to
>match particular English strings to their internal machine operations)
>can follow formal declarations in that language giving semantics to
>the other language so they can then understand both.
>
> > >And it allows machine processing via the
> > >crude-but-effective mechanism of exact matching of strings.
> >
> > The longer the prose gets, the harder it is to get exact string matching.
>
>I don't see computers having a real problem comparing arbitrary length
>byte strings reproducibly.  But no, I wouldn't ask people to type the
>stuff.  For that, I'd use a level of indirection, like a web address
>for the text.  But the level of indirection would of course be
>formally specified in my proposed protocol to the mechanism which
>accepted the user input.
>
>(My main reason for not having that web-address indirection in from
>the start is that web page contents can change, and for security and trust
>reasons I think it's essential to know certain semantics are immutable.)

What you are calling 'semantics' is just using a very long reference 
number. The machines doing the matching have no idea what the English 
is saying, or even that the strings are in English.
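
A small Python sketch of why exact string matching treats the English as a reference number. Everything here is assumed for illustration: the term names, the definition texts, the operation name "op_parent", and the helper functions are hypothetical. Replacing every definition by an opaque digest leaves the matching result unchanged, so the matcher never depended on what the text says.

```python
import hashlib

def resolve(vocabulary, known_operations):
    """Map each term to an internal operation by exact match on its definition string."""
    return {
        term: known_operations[definition]
        for term, definition in vocabulary.items()
        if definition in known_operations
    }

def opaque(text):
    """Replace a definition by a digest that carries no readable content."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

# Hypothetical vocabulary: term -> English definition.
vocabulary = {
    "parentOf": "The subject is a biological parent of the value.",
    "likes": "The subject is fond of the value.",
}

# Hypothetical agent knowledge: definition string -> internal operation.
known = {"The subject is a biological parent of the value.": "op_parent"}

with_english = resolve(vocabulary, known)
with_digests = resolve(
    {term: opaque(d) for term, d in vocabulary.items()},
    {opaque(d): op for d, op in known.items()},
)

# Both runs resolve exactly the same terms to the same operations.
same_result = with_english == with_digests
```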

Eons ago, IBM used to give a quiz to budding programmers, and one 
trick question was, what is the worst thing to do to a punched card? 
Options included things like re-punch it, fold it, tear it, get it 
wet, etc., and the answer was, write on it. Because that set up a 
situation where the human understood something which the machine 
couldn't read. You are proposing having blank punched cards with 
writing on them, and calling it 'semantics'.

Pat Hayes

---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Friday, 6 April 2001 14:31:07 UTC