W3C home > Mailing lists > Public > www-rdf-logic@w3.org > February 2001

Re: universal languages

From: pat hayes <phayes@ai.uwf.edu>
Date: Mon, 5 Feb 2001 11:10:26 -0800
Message-Id: <v04210105b6a3cd679d2b@[205.160.76.250]>
To: "Tim Berners-Lee" <timbl@w3.org>
Cc: "Drew McDermott" <drew.mcdermott@yale.edu>, <connolly@w3.org>, <www-rdf-logic@w3.org>, <pfps@research.bell-labs.com>, <danbri@w3.org>, <horrocks@cs.man.ac.uk>
Drew McDermott:

> > There are several important issues in designing DAML, but they seem to
> > me to be mainly semantic and computational, not syntactic.

Tim B-L:

>Yes.  But there is a parallel discussion of syntax.
>Those languages which map into RDF triples take advantage
>of the separation of syntax and semantics which the triples
>were designed to enable.

[Pat Hayes:]
Unfortunately this separation seems very elusive. In fact the members 
of the RDF committee seem to disagree about where exactly it occurs. 
Can you point me (us) to a reasonably precise summary of what this 
separation amounts to?

> > [Drew] There has
> > been a lot of work on representation systems over the years, so that
> > by now we at least understand the tradeoffs involved.  In essence:
> > the more expressive the language the less tractable is its use.
> > Peter's mention of Montague was an ironic mention of one end of the
> > continuum.  Typed lambda calculus is at approximately the same
> > point.  (Montague can be considered to be lambda calculus applied to
> > representation of natural language.)
> >
> > Anyway, the main questions would seem to be: How much expressivity do
> > we need?
>
[Tim]

>Enough to be able to express anything in any of the applications which
>will use the web.  So, we would want anything (data, rule, query...)
>from Peter's system or Ian's system to be expressible in the universal
>language.
>This of course means that the language will be powerful enough to
>ask uncomputable questions.
[Pat]
Yes, but that is not the central problem. That is Turing 
undecidability, but the original Goedel problem cuts deeper: if you 
have 'reasonable' expressive powers in a descriptive logic (hard to 
define exactly in full generality, but basically if you combine 
recursion with quantification) and you also give it a truth 
predicate (make it able to describe truth for a wide enough range of 
languages to include itself), then it will be *incoherent*. It will 
generate paradoxes all by itself.  So there are no 'universal' 
languages. This is now a very basic and well-investigated idea, about 
as well-established and thoroughly understood as, say, the second law 
of thermodynamics.
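(For the record, the diagonal argument -- strictly Tarski's theorem 
rather than Goedel's -- runs like this:)

```latex
\text{Diagonalization yields a sentence } S \text{ with: } 
  S \leftrightarrow \neg T(\ulcorner S \urcorner) \\
\text{Any truth predicate worth the name satisfies the T-schema: } 
  T(\ulcorner S \urcorner) \leftrightarrow S \\
\text{Chaining the two: } S \leftrightarrow \neg S
```

That contradiction is generated inside the language, by the language, 
with no help from any particular set of facts.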
(Maybe it would help to explain the occasional note of exasperation 
which comes through these discussions if I made an analogy. It is one 
thing to watch Junkyard Wars, and even to admire the pragmatic 
engineering abilities of the contestants. But if they start saying 
they are trying to make a perpetual-motion machine, you tend to stop 
taking them seriously. Sorry if I am being rude.)

> But we do not ask to be able to
>export a query which A's system can answer and have B's system
>answer it.  We do export a subset of the information from one system
>to another.  We do require that if A's system infers something it can
>write a proof in the universal language and B will be able to *check*
>(not find) the proof.

If they are both using a universal language, the checking is simple: 
don't bother, because all conclusions are valid. You don't need the 
proof.

More helpfully: forget about 'universal' for a second; then the first 
question I would ask is, what are these proofs *about*? Because you 
will almost certainly need different ontologies (even if they are 
expressed in the same basic logical language) in different cases. 
The choice of language isn't likely to be centrally important for what 
you want, it seems to me, as long as it has what might be called a 
basic expressive ability (some version of FOL rather than RDF, say). 
You can do the rest with ontologies expressed in that language. And 
since you don't apparently care about proof generation and are not 
interested in automatic use of ontologies by software, why don't you 
just choose some dialect of FOL or HOL and use that? Why be so fussy? 
It seems like you have thrown away all the criteria which would lead 
anyone to pick one logic or inference system over another, so what 
*are* your guidelines? For example, you and Dan C. seem extremely 
interested in higher-order type theories of one kind or another. Why? 
What do you think they will give you that a first-order language will 
not give you?

> > What sorts of inferences do we need?
>
>I am not an expert, so excuse any misuse of terms.
>
>There are many related logics, which use different sets of
>axioms and inference rules, but have the (provable) property
>that there is a mapping from formulae in one to formulae in
>another such that one engine cannot infer something
>which, mapped across, is false in the other system.
>
>The universal language defines this set of logics, if you like.
>It doesn't define which set of axioms any engine should use.
>It doesn't define which set of inference rules.

But Tim, this (literally) doesn't make sense. A specification of a 
logical language *is* a specification of a set of axioms and 
inference rules (or proof-construction rules, more generally). So what 
do you mean by talking about a language without a language? That's 
like saying you want a universal human interlingua without a grammar.

>It does require that whatever you do use must be consistent
>with some standard set.

1. 'Consistent with' is a semantic notion, so how are you going to 
check it mechanically?
2. In any case, to do so you would have to somehow provide a 
translation between the two languages. The question of mutual 
consistency doesn't even arise until we are in a common linguistic 
framework.

>So all expressions of FOPC for example are basically
>equivalent.

Yes and no. That is a potentially very misleading statement, a bit 
like saying that all programming languages are 'basically equivalent' 
because they all compute the same Turing-computable functions. There 
are many, probably hundreds, of 'versions' of FOPC, each with its own 
formal notion of proof (Gentzen systems, natural-deduction systems, 
machine-oriented ND systems, resolution-based, semantic tableaux, etc. 
etc.). They all use a similar basic notation for expressions, and they 
are all equi-expressive in some deep semantic sense, but they have 
wildly different notions of *proof*, and the translations between 
them are often very complex and weird (e.g. a Gentzen-system proof can 
be viewed as an ND tree folded in half and drawn sideways, but with 
the tips written twice, more or less). It isn't feasible to ask for 
some kind of 'universal' proof system, in spite of the underlying 
semantic similarity; and it isn't feasible to expect an engine to be 
able to check proofs written in one way using the rules for proofs 
written a different way.
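(To make vivid how wedded a checker is to one proof format: here is a 
toy checker, in Python with an invented formula representation, for 
proofs written as flat modus-ponens sequences. A natural-deduction 
tree or a resolution refutation is simply not a well-formed input to 
it, however valid its conclusion:)

```python
# Toy checker for a Hilbert-style presentation: a proof is a list of
# formulas, each either a premise or obtained from two earlier lines
# by modus ponens.  Formulas are atoms (strings) or implications
# represented as ('impl', antecedent, consequent).  This representation
# is invented for illustration; it is not any actual tool's format.

def checks(proof, premises):
    """Return True iff every line is a premise or follows by modus ponens."""
    derived = []
    for line in proof:
        ok = line in premises or any(
            a == ('impl', b, line)      # some earlier line a is b -> line
            for a in derived for b in derived)
        if not ok:
            return False
        derived.append(line)
    return True

premises = ['p', ('impl', 'p', 'q'), ('impl', 'q', 'r')]
good = ['p', ('impl', 'p', 'q'), 'q', ('impl', 'q', 'r'), 'r']
bad  = ['p', 'r']                       # 'r' has no justification yet

print(checks(good, premises))           # True
print(checks(bad, premises))            # False
```

The checker *is* the inference rule, in effect; hand it a proof 
written against different rules and it can only reject it.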

What I think might be feasible is something like the logical 
equivalent of an operating system. The OS doesn't actually *run* the 
code; it just provides it with resources and looks after the 
communications. But even with a good OS, you still need APIs to get 
the programs to actually interoperate. It can't be expected to do 
magic.

> I understand that Conceptual Graphs and KIF
>are higher order logics and are equivalent.

No, they are both first-order. The new KIF will allow what look to 
the casual eye like higher-order expressions, but it will still be a 
first-order logic because it won't have any higher-order inference 
rules.

Real higher-order systems are rare in AI largely because unification 
is impossible in real HOL (you need to be able to unify 
lambda-expressions, which is nastily uncomputable), so the inference 
systems are kind of dead in the water.
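(To see the contrast: first-order unification, the engine room of 
resolution, is decidable and cheap. A toy sketch in Python -- the 
term representation, with variables as strings starting with '?', is 
invented for illustration. Nothing like this terminates in general 
once the terms may be lambda-expressions:)

```python
# First-order unification: terms are atoms/variables (strings, with
# variables starting '?') or applications (functor, arg1, ..., argN).
# Decidable and fast -- exactly the property that fails for the
# lambda-terms of a genuinely higher-order logic.

def walk(t, s):
    """Follow the substitution s from a variable to its binding."""
    while isinstance(t, str) and t.startswith('?') and t in s:
        t = s[t]
    return t

def occurs(v, t, s):
    """Occurs check: does variable v appear inside term t under s?"""
    t = walk(t, s)
    if t == v:
        return True
    return isinstance(t, tuple) and any(occurs(v, a, s) for a in t[1:])

def unify(a, b, s=None):
    """Return a substitution unifying a and b, or None if impossible."""
    s = dict(s or {})
    a, b = walk(a, s), walk(b, s)
    if a == b:
        return s
    if isinstance(a, str) and a.startswith('?'):
        return None if occurs(a, b, s) else {**s, a: b}
    if isinstance(b, str) and b.startswith('?'):
        return unify(b, a, s)
    if isinstance(a, tuple) and isinstance(b, tuple) \
            and a[0] == b[0] and len(a) == len(b):
        for x, y in zip(a[1:], b[1:]):
            s = unify(x, y, s)
            if s is None:
                return None
        return s
    return None

print(unify(('loves', '?x', 'mary'), ('loves', 'john', '?y')))
# {'?x': 'john', '?y': 'mary'}
```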

Even for proof-checking you should watch out for this, by the way, as 
it could still be a problem there. If your checker needs to be able 
to check that one expression can be unified with the antecedent of an 
inference rule, better look carefully at how smart the proof-checker 
expects the proof-writer to be, or whether it might need input from 
somewhere else to help it along. Many proof-checking systems are 
human-assisted for just this kind of reason.

>A library catalog system for example may have a very
>limited inference system.  Lots of systems will be based on
>SQL databases. They will do useful work because they will
>operate on queries and datasets forming very well-defined
>and limited situations.
>
>At the other end of the scale, the web of all the stuff these
>things are operating on will be a paradise as fodder for
>the AI hacker.

Well, one could say that about the web as it stands today, where 
people are trying to do content scraping from the NL text on web 
pages and Google uses AI search techniques. So what will be new, if 
the semantic web needs AI?

> >  Should the language
> > have subsets at different points on the expressivity/tractability
> > spectrum?
>
>Absolutely!
>
> > What do we pay for the ability to check proofs in the
> > Calculus of Constructions (a very hairy type system)?
> >
> > As we answer these questions, we would develop an abstract syntax for
> > the language, that is, a syntax in which tree structures are taken as
> > primitive, so parsing isn't an issue.  Finally, we would discuss ways
> > of embedding it in concrete computational structures, such as XML.
>
>(I have often felt that making a version of KIF which used URIs and
>namespaces for identifiers (webizing KIF) would get this community
>off to a faster start when it came to email discussions.

But you *can* use URIs in KIF (not ANSI KIF, due to a brain-damaged 
design decision, but we are fixing that soon). There is nothing in 
the rules of logic that prevents you from using URIs as logical 
names, in general. Why did y'all think that you needed to define a 
new language to use URIs?

>It would also make a few points about where the world is
>a tree and where a graph.  The whole interest in the SWeb is that
>it is a web -- that the same term can be referred to in formulae
>all over the planet.  There is a tension with the totally tree-structured
>nature of a parse tree, on the surface, at least.)

I don't quite see what the conflict is. The same identifier can occur 
in many parse trees in KIF (or CG's) just as well as in RDF or 
anything else.

> > Instead, we seem to have a commitment to a very low-level and clumsy
> > concrete syntax (RDF) that everyone says has bugs.
>
>The bugs are well known and can be eradicated easily.

I'm not so sure. They continue to give us problems, and the central 
use of reification in RDF is a logical black hole.

> The abstract
>syntax of triples (like lists in KIF) is cleaner.
>
>We are trying to bring together formal systems and real engineering
>which has just graduated from ASCII to XML. It's a culture thing.
>So long as the abstract language is clean, the syntax which others
>use for it is something which can be compromised. And you can have more than
>one syntax.  But the triples model was, as I said, a way of isolating
>the syntax issues and semantics issues. When you reject that, you get
>the discussions linked again.

I agree with your basic point. The surface syntax isn't that important 
as long as it isn't totally dumb (like RDF). The central problem with 
RDF is still there in the triples model, though.

> >  As Dan K. points out,
> > almost none of the problems solved by RDF tools are the problems we
> > actually care about.  This shouldn't be a big surprise.  If someone
> > developed a set of tools for manipulating Sanskrit, knowing nothing
> > but the alphabet, the tools could tell you how many occurrences of the
> > letter "A" there were, but not much else.
>
>Yes.
>
> > When higher-level issues are raised, we hear offhand remarks to the
> > effect that Oh well, we can use KIF; or oh well, we can use typed
> > lambda calculus.  I don't understand why the central issues get
> > settled by offhand remarks (or usually, *not* settled).
>
>I haven't seen anyone presume these issues settled. Nor any
>presumption that it would be easy.  Though the KIF folks do
>often imply that their language is expressive enough to eat
>most of the other contestants as the universal one in practice.

God, I hope I'm not one of those "KIF folks". I wouldn't put forward 
KIF as anything like 'universal', except in the rather modest sense 
that it is probably capable of expressing anything that could be said 
in any other first-order logic. The same might be said about MELD or 
SNARK or any one of dozens of syntactic alternatives.

> > Don't get me wrong.  I'm not arguing in favor of another concrete
> > syntax.  I like XML for the same reasons I like Lisp: it provides a
> > level of hierarchical structure intermediate between characters and
> > syntax trees.  But, like Lisp, XML imposes almost no constraints on
> > what actually gets expressed in it.  Great!  Why don't we define a
> > language and then find a way to embed it in XML?
>
>Ok, go ahead.
>
> >  (Step 2 should take
> > about 15 minutes.)  What exactly is the role RDF plays in all this?
>
>RDF is the triple model, plays the same role as lists in Lisp.
>To not use it would be like breaking away from dotted pairs in Lisp.

Unlike RDF, Lisp actually works as a coherent syntax model. The 
'triples' idea can only describe complex syntax by reifying its own 
structures, which introduces a truth-predicate into the very 
syntactic meta-description. This isn't quite directly inconsistent, 
but it is a bit like walking around with a loaded gun in your pocket.
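(For readers who haven't seen the mechanism: to say anything *about* 
a triple, RDF makes you mint a node and assert four more triples 
about it. A sketch in Python using the standard RDF reification 
vocabulary; the blank-node label '_:s1' is invented for the example:)

```python
# RDF reification: to describe the triple (A, loves, B) you mint a new
# node and assert four triples about it, using the rdf:Statement
# vocabulary.  Triples here are plain Python tuples; '_:s1' is an
# invented blank-node label.

def reify(triple, node):
    """Return the four triples that describe `triple` via `node`."""
    s, p, o = triple
    return [(node, 'rdf:type',      'rdf:Statement'),
            (node, 'rdf:subject',   s),
            (node, 'rdf:predicate', p),
            (node, 'rdf:object',    o)]

quads = reify(('ex:A', 'ex:loves', 'ex:B'), '_:s1')
for t in quads:
    print(t)
```

Note that nothing in the four triples asserts that the described 
triple *holds*; to say that, you need something that behaves like a 
truth predicate over statements -- which is the loaded gun.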

> > How committed are we to it?  Who decides when we aren't committed to
> > it any more?
>
>When you show that your XML syntax for your universal language
>can't be expressed at all nicely as triples.
>There is an open question as to whether building on RDF means
>building using triples, or building using triples and other stuff
>(n-ary predicates, for example).

N-ary predicates into triples is easy. Quantifier syntax into triples 
isn't so easy. (It is *possible*, but it is truly ghastly. You have to 
map the quantifiers into a variable-free relational algebra and then 
map that into triples using the n-ary-to-triples mapping. A simple 
one-line KIF expression would become several pages of RDF and be 
totally unreadable.) Negation or disjunction into triples isn't so 
easy either, without using reification.
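(The easy n-ary-to-triples mapping, for the record, is just this: 
mint a node for the atomic sentence and attach each argument with a 
positional property. A sketch in Python -- the property names 'pred', 
'arg1', ... and the node label '_:n1' are invented for illustration:)

```python
# Encode an n-ary atomic sentence P(a1, ..., an) as triples by minting
# a node for the sentence and attaching each argument positionally.
# Property names ('pred', 'arg1', ...) are invented for this sketch.

def nary_to_triples(pred, args, node):
    triples = [(node, 'pred', pred)]
    triples += [(node, 'arg%d' % (i + 1), a) for i, a in enumerate(args)]
    return triples

# between(a, b, c) becomes four triples:
for t in nary_to_triples('between', ['a', 'b', 'c'], '_:n1'):
    print(t)
```

Quantifiers resist this because the node-and-property encoding has 
nothing to bind variables with; you have to compile them away first, 
which is where the several pages of RDF come from.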

>The first would be very neat. It would allow inference
>to be expressed in terms of triple operations. But we don't have to
>shoot ourselves in the foot to get it.
>
>In theory the question becomes meaningless when you can translate
>any language into triples.

You CAN'T translate any language into triples!  Describing the syntax 
of a language is not the same as translating it.

>For example, DAML tells you how to
>represent lists. I converted the KIF axioms for DAML into
>DAML lists and processed them as triples.

Yes, maybe you can map one syntax into another this way  (with some 
effort, though I don't see why you would want to do it).  You can map 
anything in any RE language into strings of ones and zeros in the 
same sense.  But that is not translation.

Pat Hayes

---------------------------------------------------------------------
IHMC					(850)434 8903   home
40 South Alcaniz St.			(850)202 4416   office
Pensacola,  FL 32501			(850)202 4440   fax
phayes@ai.uwf.edu 
http://www.coginst.uwf.edu/~phayes
Received on Monday, 5 February 2001 11:07:28 GMT
