Re: Where DAML+OIL deviates from the RDF-Schema spec. from Deborah Mcguinness on 2001-03-03 (www-rdf-logic@w3.org from March 2001)

From: Deborah Mcguinness <dlm@ksl.stanford.edu>
Date: Sat, 03 Mar 2001 12:04:55 -0800
To: Ian Horrocks <horrocks@cs.man.ac.uk>
CC: Jim Hendler <jhendler@darpa.mil>, Dan Brickley <danbri@w3.org>, Tim Berners-Lee <timbl@w3.org>, Frank van Harmelen <Frank.van.Harmelen@cs.vu.nl>, Graham Klyne <Graham.Klyne@MIMEsweeper.com>, www-rdf-logic@w3.org
Message-ID: <3AA14E67.4BB1192F@ksl.stanford.edu>
Just a few empirical observations on loops -

- I crawled a number of "naturally occurring" web taxonomies about 1.5 years ago, including
things like Yahoo Shopping, Topica, Lycos, Amazon, etc.  I was surprised to find that
most of them did include cycles.
I never confirmed with the authors of the taxonomies, but in my opinion, none of the
cycles were intended.
My point here is just to provide empirical support: in my experience, cycles occur with
regularity.
Arguably, many of those taxonomies were built by people without extensive KR training,
but just in case people want other data...
- Chimaera reports cycles in our diagnostic tests.  We have run many ontologies through
Chimaera and detected a number of unintentional cycles in ontologies built by
people with some training in knowledge representation.
- In a small number of cases, cycles were detected and presented back to
the original KB authors, and the authors defended their modeling choice to use cycles.  One
case in particular that I remember involved the classes interest and motivation in one
government KB (built by quite literate KR people).  They explained that they wanted two
different names while using the same definition for both terms.  (One could argue with this
decision, but ultimately we want to support conceptual modeling that is natural to designers,
and in this case the authors had spent time on their conceptual model and were convinced that
this was the choice that met their needs best.)

Thus, I support the notion, particularly for widely used ontologies, that:
1 - cycles occur in practice
2 - sometimes (rarely in my subjective view) people really want to keep the cycles
3 - a kb should not be considered broken because it includes a cycle
4 - tools should be provided that warn users of cycles and suggest repair strategies if
possible.
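
To make point 4 concrete, here is a minimal sketch of the kind of check such a tool
might run.  It is not how Chimaera or any DAML+OIL tool actually works; the function
names, the (subclass, superclass) pair format, and the class names in the usage note
are all assumptions made for illustration.  It groups explicitly asserted subClassOf
links into strongly connected components (Tarjan's algorithm) and reports every group
of two or more classes, since those classes lie on a common cycle and would collapse
into one class.  It only sees syntactic cycles, so the implicit subClassOf relations
Ian mentions in his point 4 would need a reasoner.

```python
from collections import defaultdict


def find_subclass_cycles(subclass_pairs):
    """Return sorted groups of classes that lie on a common subClassOf cycle.

    subclass_pairs is an iterable of (subclass, superclass) name pairs.
    This input format is an assumption made for the sketch.
    """
    graph = defaultdict(list)
    nodes = set()
    for sub, sup in subclass_pairs:
        graph[sub].append(sup)
        nodes.update((sub, sup))

    index = {}        # discovery order of each class
    lowlink = {}      # lowest discovery index reachable from each class
    stack = []        # classes in the component currently being built
    on_stack = set()
    groups = []
    counter = 0

    def visit(node):
        # Recursive Tarjan step; a very deep hierarchy would need an
        # iterative version to stay under Python's recursion limit.
        nonlocal counter
        index[node] = lowlink[node] = counter
        counter += 1
        stack.append(node)
        on_stack.add(node)
        for parent in graph[node]:
            if parent not in index:
                visit(parent)
                lowlink[node] = min(lowlink[node], lowlink[parent])
            elif parent in on_stack:
                lowlink[node] = min(lowlink[node], index[parent])
        if lowlink[node] == index[node]:
            component = []
            while True:
                member = stack.pop()
                on_stack.discard(member)
                component.append(member)
                if member == node:
                    break
            # A single class is not a cycle worth reporting.
            if len(component) > 1:
                groups.append(sorted(component))

    for node in sorted(nodes):
        if node not in index:
            visit(node)
    return sorted(groups)


def cycle_warnings(subclass_pairs):
    """Build one warning per cycle; the kb is still accepted, not rejected."""
    return [
        "Warning: classes " + ", ".join(group)
        + " form a subClassOf cycle and collapse into a single class"
        for group in find_subclass_cycles(subclass_pairs)
    ]
```

For example, given the hypothetical pairs ("interest", "motivation"),
("motivation", "interest") and ("motivation", "MentalState"), the sketch reports one
group containing interest and motivation and leaves MentalState alone.  Whether to
keep the cycle or break it stays with the author, in line with points 2 and 3.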

Deborah

Ian Horrocks wrote:

> On March 1, Jim Hendler writes:
> > At 1:37 PM -0500 3/1/01, Dan Brickley wrote:
> > >I agree with [2] and [3], and could live with [1]. My main concern w.r.t.
> > >using loops in the class and property hierarchies to indicate synonyms is
> > >with end-user comprehensibility and with user interface generation. I can
> > >see that there's a _logical_ story to tell about why loops are OK; I'm not
> > >so sure there's a modelling and usability story. But then it's not up to
> > >the core RDFS system to guarantee that folk can't make goofy modelling
> > >decisions, I guess.
> > >
> > >Dan
> > >
> >
> > I'm with DanB on this one.  I originally proposed that if we wanted
> > to use loops to assert equality, we'd lose the ability to (1)
> > distinguish intentional from unintentional loops, and (2) force
> > developers to understand the logical relationship (which I know from
> > experience ain't always easy). In fact,
> > I was reminded later by one of my former postdocs that this situation
> > has come up in practice in our experience -- we developed some of our
> > KBs for the Parka-DB project using web-scrapers from online
> > taxonomies and thesauri.  We saw a number of cases where loops
> > existed - unintentionally, so we had to write loop breaking code in
> > our systems -- under [1] we'd have trouble distinguishing the
> > accidental from the intentional.
> >
> >   I argued we should have a language construct that was explicit to
> > assert equality.  I was overruled
>
> Err, hang on a minute. Let me quote from daml+oil.daml:
>
> <Property ID="sameClassAs">
>   <rdfs:label>sameClassAs</rdfs:label>
>   <comment>
>     for sameClassAs(X, Y), read X is an equivalent class to Y.
>     cf OIL Equivalent
>   </comment>
>   <rdfs:subPropertyOf rdf:resource="#equivalentTo"/>
>   <rdfs:subPropertyOf rdf:resource="http://www.w3.org/2000/01/rdf-schema#subClassOf"/>
>   <rdfs:domain rdf:resource="#Class"/>
>   <rdfs:range rdf:resource="#Class"/>
> </Property>
>
> So we don't want/need to use subClassOf cycles to assert
> equality. However, we have to decide what to do with such cycles when
> they are detected - it is no use just saying that they are "forbidden",
> because by the standard argument they will occur out there on the
> web. The choices are either:
>
> a. barf - declare the ontology to be illegal/broken
>
> b. accept the facts as presented and draw the appropriate conclusions
> (and possibly issue some kind of warning as to the consequences).
>
> I much prefer b. Here are just a few reasons:
>
> 1. There is no semantic justification for declaring such an ontology
> to be broken - we are just guessing that it contains one or more
> errors. This seems to be setting a very dangerous precedent - we could
> declare many other DAML+OIL constructions illegal on the grounds that
> they are often used in error.
>
> 2. It is not, in general, possible to "fix" cyclical ontologies
> automatically as there is no way to know where the cycle should be
> broken. So it is hard to do anything sensible when a cycle is detected
> other than to just reject the ontology.
>
> 3. Given that we have to detect cycles anyway, surely it is much
> better to deal with them and simply warn users that two or more
> classes have collapsed into a single class.
>
> 4. In DAML+OIL we can easily create ontologies with cycles that
> include implicit subClassOf relations that cannot be detected
> syntactically. Are these kinds of ontology also illegal?
>
> Ian
>
> > on the grounds that having two ways
> > to do the same thing in a language was bad (Dan Connolly) and that
> > DAML+OIL had to have the cycles for their associated logic engine  to
> > handle this stuff (Ian Horrocks/Peter Patel-Schneider) -- as a
> > result, my personal theorem prover was faced with
> >   a. Two equivalent solutions are bad
> >   b. We had to have solution [1]
> > and therefore, using the stuff that will power the semantic web
> > I was forced to conclude
> >   c. We should have [1] and only [1]
> >
> > If someone is willing to remove assumption a or b, I would think we'd
> > end up with something that makes more sense to the logically
> > challenged folks (like me) who really want to use this stuff to do
> > "frame"-like reasoning more than logical inference, or who would make
> > errors that would cause whole chains of subclass relationships to
> > collapse by accident.
> >    cheers
> >   Jim H
> > p.s. Parka-DB: http://www.cs.umd.edu/projects/plus/Parka/parka-db.html
> >
> >
> > At 11:53 PM +0100 2/24/01, Frank van Harmelen wrote:
> > >
> > >[1]
> > >"Warning: The RDF Schema specification demands that the
> > >subclass-relation between classes
> > >must be acyclic. We believe this to be too restrictive, since a
> > >cycle of subclass
> > >relationships provides a useful way to assert equality between
> > >classes. Consequently,
> > >DAML+OIL places no such restriction on the subClassOf relationship
> > >between classes;"
> >
> > Dr. James Hendler             jhendler@darpa.mil
> > Chief Scientist, DARPA/ISO    703-696-2238 (phone)
> > 3701 N. Fairfax Dr.           703-696-2201 (Fax)
> > Arlington, VA 22203           http://www.cs.umd.edu/~hendler
> >

--
 Deborah L. McGuinness
 Knowledge Systems Laboratory
 Gates Computer Science Building, 2A Room 241
 Stanford University, Stanford, CA 94305-9020
 email: dlm@ksl.stanford.edu
 URL: http://ksl.stanford.edu/people/dlm/index.html
 (voice) 650 723 9770    (stanford fax) 650 725 5850   (computer fax)  801 705 0941
Received on Saturday, 3 March 2001 15:04:50 UTC