Semantics, in particular DAML+OIL semantics from Ian Horrocks on 2002-08-12 (www-rdf-logic@w3.org from August 2002)

From: Ian Horrocks <horrocks@cs.man.ac.uk>
Date: Mon, 12 Aug 2002 11:51:09 +0100
To: www-rdf-logic@w3.org
Message-ID: <15703.37661.596389.537860@merlin.oaklands.net>
Here are some messages that I originally sent to the WebOnt mailing
list, and which may be relevant to the semantics debate on this list.

You can find the complete thread at [1].

Regards, Ian

[1] http://lists.w3.org/Archives/Public/www-webont-wg/2002Aug/0044.html

======================================================================
Date: Mon, 5 Aug 2002 21:30:35 +0100
To: www-webont-wg@w3.org
From: Ian Horrocks <horrocks@cs.man.ac.uk>
Subject: OWL semantics


Dear All,

Having made a plea for the resolution of the semantics issue, I would
now like to enter into the argument of what form these semantics should
take. 

I have tried to summarise below the arguments in favour of what has
become known as the "Patel-Schneider" semantics. I have tried to couch
this summary in terms that should be accessible to all members of the
WG, and I would appreciate the more technically inclined members
cutting me a little slack w.r.t. any minor imprecisions that may have
crept in as a result.


Point 1:

I would like to argue that we should KEEP THE LANGUAGE (and the
semantics) SIMPLE. In particular, we should make sure that it is (a
subset of) first order logic (FOL). I.e., when we talk about classes
we mean 1 place predicates, and when we talk about properties we mean
2 place predicates: Jane is a Person just in case our ontology entails
Person(Jane), and Jane is the mother of John just in case our ontology
entails isMotherOf(Jane,John). If the logic is first order, then the
arguments of a predicate must always be either a variable or a
constant (such as Jane).

Another way to look at this is that a model of our ontology consists
of a set of objects and a mapping function that maps individual names
(constants) to single objects, class names to sets of objects and
relation names to sets of pairs of objects. Jane is a Person just in
case that, in all models of our ontology, the object that Jane maps to
is a member of the set that Person maps to, and Jane is the mother of
John just in case that, in all models of our ontology, the pair of
objects that Jane and John map to is a member of the set that
isMotherOf maps to.

That is really all there is to it: you can explain it to pretty much
anyone. Moreover, it is part of the core curriculum of most
undergraduate computer science and maths courses, and many of the
people who will be involved in implementing the semantic web will
already be very familiar with it. Can we say the same thing about
solipsistic logic? How many of us had ever heard of it before joining
the WG, or even now have the slightest idea as to what it is all about?



Point 1a:

While it is clear that, in some cases, it is useful to be able to talk
about classes as instances, we should be clear that this means going
outside first order logic into higher order logic (HOL), because we
can now have predicates as the arguments of other predicates. If we
accept classes as instances, then even OWL Lite (which is supposed to
be "viewed by tool builders to be easy enough and useful enough to
support") will be a HOL. Some people will argue that it isn't really
higher order, but what they mean is that it may be possible (depending
on the assumed semantics of the HOL syntax) to embed the HOL in FOL by
axiomatising the HOL itself and the HOL ontology in FOL. Imagine this
conversation with a potential OWL tool builder:

Tool builder: OWL (Lite) is a higher order logic, so I will need a HOL
reasoner, right?

OWL guru: No, no. All you have to do is take a FOL reasoner, an
axiomatisation of OWL in FOL, and feed the axiomatisation, plus your
axiomatised ontology, into the FOL prover. Piece of cake really.

It is worth noting that real HOL systems, e.g., Isabelle [1] and HOL
[2], don't work via FOL axiomatisation as it is well known that, in
practice, you wont be able to prove anything this way.

It is also worth pointing out that such axiomatisations are invariably
large and complex, and that it is difficult/impossible to be sure that
they are correct. E.g., take a look at the axiomatisation of
DAML+OIL/RDF in [3], which contains around 140 axioms. FOL reasoners
can be used to detect "obvious" inconsistencies (as happened with
earlier versions of [3]), but simply ironing these out is a LONG way
from proving that the axiomatisation correctly captures the meaning of
the language.


Point 1b:

It should NOT be possible to use OWL/RDF statements to constrain the
meaning of OWL/RDF syntax. E.g., if myTransitive is a sub-class of
transitiveProperty, then instances of myTransitive should NOT be
treated by OWL as transitive properties. If we go down this road, then
we need a (HOL) reasoner even to parse the syntax of an OWL ontology,
to determine what is being said, and to check that it is valid OWL.

For example, when we parse an OWL ontology we may find that instead of
using the familiar subClassOf property, it contains lots of statements
like "Person foo Animal". If we allow statements in the ontology to
constrain the meaning of the syntax, then we may be able do deduce
that foo is equivalent to subClassOf, and that this is therefore a
meaningful OWL statement. The reasoning required for this deduction
may be extremely complex. It may even be IMPOSSIBLE to be sure that we
have derived the complete syntactic meaning of an OWL ontology
(because the language is undecidable).

Another example. In OWL, transitive properties cannot be used in
cardinality restrictions. If we allow inference to be used to deduce
that a property is a transitive property, then when we parse an OWL
ontology we can't be sure that it is valid until we have checked that
none of the properties used in cardinality constraints can be deduced
to be transitive. Again, the required reasoning may be very complex,
and it may even be IMPOSSIBLE to be sure that the ontology is
syntactically valid. One can easily imagine situations where the
inference that a property is transitive depends on a cardinality
restriction involving the property itself, which then becomes a
syntactically invalid statement, so the inference is no longer valid
(a paradox).

To return again to our poor benighted tool builder, can you imagine
the reaction we will get when we try to explain that it is impossible
to parse an OWL ontology and check it for syntactic validity without
employing the services of a (possibly higher order) reasoner?


Point 2:

If we decide to go with some form of higher order logic and/or allow
reasoning about syntax, then all our efforts to design OWL so as to be
relatively simple, and hopefully even implementable, are a (bad)
joke. Even the so called Lite version of OWL would be impossibly
complex to deal with, as I hope I showed in some of the above
examples.



I guess that is enough to be going on with.

Regards, Ian


[1] http://www.cl.cam.ac.uk/Research/HVG/Isabelle/index.html
[2] http://www.cl.cam.ac.uk/Research/HVG/FTP/FTP.html
[3] http://www.w3.org/TR/daml+oil-axioms


======================================================================
Date: Tue, 6 Aug 2002 10:19:00 +0100
To: "Jos De_Roo" <jos.deroo.jd@belgium.agfa.com>
Cc: www-webont-wg@w3.org
From: Ian Horrocks <horrocks@cs.man.ac.uk>
Subject: Re: OWL semantics


On August 6, Jos De_Roo writes:
> 
> [only a very partial reply]
> 
> [...]
> 
> > It is also worth pointing out that such axiomatisations are invariably
> > large and complex, and that it is difficult/impossible to be sure that
> > they are correct. E.g., take a look at the axiomatisation of
> > DAML+OIL/RDF in [3], which contains around 140 axioms. FOL reasoners
> > can be used to detect "obvious" inconsistencies (as happened with
> > earlier versions of [3]), but simply ironing these out is a LONG way
> > from proving that the axiomatisation correctly captures the meaning of
> > the language.
> 
> that is not enough to suggest a
>   "impossible to be sure that they are correct"
> let's just call it difficult/challenging
> engineering and no more

We have invested a lot of time in discussing OWL Lite on the grounds
that the full language would be too difficult for many implementors to
deal with. Can you imagine the conversation with one of these
implementors:

Implementor: Where is the specification of the semantics?

OWL guru: We believe that the intended semantics of the language is
captured in this large set of axioms.

Implementor: You believe?

OWL guru: Yes. Proving that they really do capture the intended
semantics is a difficult/challenging engineering problem.


Moreover, I would be interested in your suggestions as to how one
might go about proving that such an axiomatisation is correct. As you
refer to engineering, I presume that this means testing. Testing wont
PROVE anything (there are infinitely many possible inputs), it can
only show that you didn't find any errors yet. In fact, given that the
axiomatisation is in FOL, which is undecidable, even inputs that would
in principal lead to errors may not reveal anything due to
incompleteness in the FOL reasoner.

Regards, Ian


> 
> -- ,
> Jos De Roo, AGFA http://www.agfa.com/w3c/jdroo/



======================================================================
Date: Thu, 8 Aug 2002 14:47:12 +0100
To: "Christopher Welty" <welty@us.ibm.com>
Cc: www-webont-wg@w3.org, www-webont-wg-request@w3.org
From: Ian Horrocks <horrocks@cs.man.ac.uk>
Subject: Re: OWL semantics


On August 6, Christopher Welty writes:
> 
> WIth regards to Ians two basic points:
> 
> 1a) Classes as instances:
> 
> This is, to me, a fascinating issue, and I'm not willing to take Ian and 
> Peter's word on it yet.  It is certainly a useful feature to have.  Pat 
> and Chris Menzel worked out a way to make the semantics of this 
> first-order for KIF/CL/whatever it's called now, however to my knowledge 
> they have never shown any of the (algorithmic) consequences for reasoning. 
>  Pat shrugs and says it should be no problem since it's all first order, 
> and the syntax prevents paradoxes.  I'm not willing to take their word for 
> it, either.

Allowing classes of classes cannot be done directly in standard first-order
logic while retaining a natural correspondence between classes and
predicates.  Either changes have to be made to first-order logic, perhaps
in the way proposed by Pat Hayes and Chris Menzel, or the instance-class
relationship has to be some other relationship than membership in the
extension of a predicate.  

These changes do not, however, come for free.

The first change requires the development of a different logic.  Pat Hayes and
Chris Menzel have embarked on this path, but with a different logic comes
the possibility of changes in the characteristics of the logic as well as 
requirements for different tools to work with the logic.

The second change interferes with the natural embedding of OWL into FOL, an
embedding that provides considerable benefits.  In other words, instead of
OWL really being a portion of FOL, OWL would be a language whose meaning we
could describe using FOL.

Some of the consequences of losing a natural embedding are that:

1. Instead of OWL mapping directly to a fragment of FOL with relatively
nice computational properties, the meaning of OWL would be described using
(at least) full FOL.

2. Instead of relying on the built in machinery of FOL to deal with
intersection, union, negation, etc., it would be necessary to treat them as
standard predicates whose meaning is described by the axiomatisation.

As we have already discussed, it is not easy to be sure that the
axiomatisation correctly captures the meaning of the language. Moreover,
even if we believe that it is correct, using the axiomatisation can be very
costly form a computational perspective: instead of reasoning directly with
boolean connectives, for example, we have to reason with an axiomatisation
of their meaning. Many of the optimisations built into existing FOL
reasoners would be rendered ineffective as they would not be able to
recognise and exploit the properties of the axiomatised connectives.


> Applying this result (assuming there is an algorithm) to OWL may not be 
> trivial, however, since much computation in description logics occurs over 
> the class descriptions - even if it the reasoner is model based.  So, at 
> least from what I understand, a lot hinges on the fact that the extension 
> of a class does not include other classes.  Still, what are the 
> consequences of including classes in the extension of classes?  An 
> inference may add more to the description of a class?  As long as that 
> addition is monotonic, is it really that bad?
> 
> So Ian, Peter, are you objecting prima facie to the fact that "classes of 
> classes" is inherently second order, or do you have examples in mind that 
> would break the reasoner?

I have no idea how to build a "direct" reasoner for such a language (I
could, of course, use the axiomatisation approach), so I can't say how
such a reasoner might be broken. The fact that inferences are
monotonic doesn't really help much - we have to be sure that we are
adding all and only correct inferences, and that we are not going to
add infinitely many inferences.

Regards, Ian


> 
> 1b) Using OWL to change the syntax of OWL
> 
> I strongly agree with Ian's position here.   It is well known that when 
> you use a logical system to *talk about itself* you are introducing 
> incompleteness.  This was part of Godel's theorem.  In every system that 
> permitted this I've ever seen, it was a trivial matter to create 
> incomplete and infinite proofs, so trivial that naive users did it all the 
> time.  Unless you break processing into two discrete steps, the way 
> compilers/loaders do (i.e. compile time and run time) you are also 
> introducing a huge amount of machinery to support this.  Machinery that no 
> one really knows how to build. 
> 
> Adding such a class of  features reduces the space of possible 
> implementors for the full language from a handful to none.  It may be 
> interesting to create that as "OWL level three" and make it a challenge 
> problem for Computer Science.
> 
> -ChrisW
> 
> 
> Dr. Christopher A. Welty, Knowledge Structures Group
> IBM Watson Research Center, 19 Skyline Dr.
> Hawthorne, NY  10532     USA 
> Voice: +1 914.784.7055,  IBM T/L: 863.7055
> Fax: +1 914.784.6078, Email: welty@us.ibm.com



======================================================================

Date: Fri, 9 Aug 2002 15:48:17 +0100
To: "Jonathan Borden" <jonathan@openhealth.org>
Cc: <www-webont-wg@w3.org>
From: Ian Horrocks <horrocks@cs.man.ac.uk>
Subject: Re: SEM: integration with RDF MT


On August 8, Jonathan Borden writes:
> 
> This back and forth between axiomatic and model theoretic semantics is
> confusing me a bit. I take it that the RDF model theory is being developed
> as a model theoretic semantics. Pat Hayes and Guha have proposed Lbase as a
> framework to integrate various model theories for semantic web languages. Is
> this the right question?

It may be interesting from a theoretical perspective, but it wont
change any of the issues we have been debating or solve any of the
problems.

> Does the choice of axiomatic vs. model theoretic semantics affect this? Is
> it possible to extend a model theoretic semantics via axiomatization? It is
> possible to develop the OWL semantics as an extension of Lbase? How are the
> RDF and OWL semantics intended to be related?

The question we need to ask ourselves is what we want semantics for
and what we want to do with them. My idea is that they should be a
declarative specification of the meaning of the language that is both
simple enough for people to understand and agree that it really does
capture the intended meaning, and precise enough to act as yardstick
against which algorithms and implementations can be measured.

If we choose a language that has a natural mapping to standard FOL then we
can satisfy these requirements by using either (or both) a standard MT or a
direct mapping into standard FOL, i.e., a mapping where classes
(resp. properties) map to unary (resp. binary) predicates, intersection
maps to and, union maps to or, negation maps to not, etc.

If we choose a language without a natural mapping to standard FOL, e.g., a
language that attempts a complete integration with RDF and includes classes
of classes, then our job becomes much tougher.  We can devise a MT for the
language, one that extends the RDF MT, but it will be much more complex,
e.g., including comprehension principles.  We may be able to devise a
complete axiomatisation of the language in FOL, but the axiomatisation will
be complex.  We may even be able to map the language into a different
logic, such as Lbase, but then we would be dependant on the properties of
this other logic.

Apart from the fact that it may not even be possible to generate a coherent
specification (i.e., one free from paradoxes), the difficulty with the
first two of these is that they become sufficiently complex that it is no
longer easy to be sure that they correctly capture the intended meaning. We
then have the problem Chris described, where we have to be concerned not
only with possible errors in our implementations of the specification but
also with possible errors in our model of the specification.

Regards, Ian


> 
> Jonathan
Received on Monday, 12 August 2002 06:53:31 UTC