Re: Integrity constraints in DAML-O (was: Chainsaw?) from Graham Klyne on 2000-11-03 (www-rdf-logic@w3.org from November 2000)

From: Graham Klyne <GK@Dial.pipex.com>
Date: Fri, 03 Nov 2000 17:08:24 +0000
To: Sergey Melnik <melnik@db.stanford.edu>
Cc: RDF interest group <www-rdf-interest@w3.org>, RDF Logic <www-rdf-logic@w3.org>
Message-Id: <4.3.2.7.2.20001103141628.00de1b50@pop.dial.pipex.com>
Sergey,

Thanks for your thoughts -- this is great!

I don't have any final answers (shame ;-) but I think I am seeing ways that 
I can proceed, without (hopefully) trampling too much on the toes of 
existing work.  My response below is in four main parts:
1. an immediate response to your suggestions,
2. a wider comment on some of the logical (and ontological?) issues that 
they touch,
3. an attempt to identify the different but overlapping goals of my work 
and DAML-O
4. an outline of how I propose to progress my ideas


1. an immediate response to your suggestions
--------------------------------------------
For me, your suggestions underlined the value of being able to introduce 
"bound variables" to describe complex relationships.  The use of variables 
in this way frees the logical descriptions from having to exist in 
specified structural relationships to each other.
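
As a rough illustration of what such bound variables buy us, here is a small Python sketch that checks a conjunction of existential statements (the R001 wheel constraint from your message, quoted below) against a set of triples.  The "?" prefix for variables, the function names and the sample model are all assumptions made for this sketch;  they are not part of RDF, DAML-O or any existing tool.

```python
# Triples are (subject, property, object) tuples.  Terms starting with "?"
# are bound variables; that prefix is a convention assumed for this sketch.

def is_var(term):
    return isinstance(term, str) and term.startswith("?")


def match(patterns, triples, binding=None):
    """Yield each variable binding that satisfies every pattern at once."""
    binding = binding or {}
    if not patterns:
        yield dict(binding)
        return
    first, rest = patterns[0], patterns[1:]
    for triple in triples:
        candidate = dict(binding)
        for pattern_term, actual in zip(first, triple):
            if is_var(pattern_term):
                # Bind on first use; afterwards the same value is required.
                if candidate.setdefault(pattern_term, actual) != actual:
                    break
            elif pattern_term != actual:
                break
        else:
            # No term mismatched: carry on with the remaining patterns.
            yield from match(rest, triples, candidate)


# The R001 conjunction, written as patterns over variables X and W1..W4.
R001 = [
    ("?X", "rdf:type", "Car"),
    ("?X", "hasWheel", "?W1"),
    ("?X", "hasWheel", "?W2"),
    ("?X", "hasWheel", "?W3"),
    ("?X", "hasWheel", "?W4"),
    ("?W1", "wheelID", "FL"),
    ("?W2", "wheelID", "FR"),
    ("?W3", "wheelID", "RL"),
    ("?W4", "wheelID", "RR"),
]

# A hypothetical model to check against the constraint.
model = [
    ("MyCar", "rdf:type", "Car"),
    ("MyCar", "hasWheel", "w1"),
    ("MyCar", "hasWheel", "w2"),
    ("MyCar", "hasWheel", "w3"),
    ("MyCar", "hasWheel", "w4"),
    ("w1", "wheelID", "FL"),
    ("w2", "wheelID", "FR"),
    ("w3", "wheelID", "RL"),
    ("w4", "wheelID", "RR"),
]

solutions = list(match(R001, model))
print("constraint satisfied:", bool(solutions))
```

The patterns do not need to sit in any particular structural position relative to each other;  the shared variable names alone tie them together.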

I had already been coming to a view that some kind of "local variable" or 
"parameterization" framework was going to be required.  This line of 
development seems to parallel parameterized types (aka templates, generics, 
etc.) in conventional programming languages.

It seems that the scoping of such variables is an interesting 
issue;  following Pat Hayes' comments [3], a very flexible approach to 
scoping (including allowing them to appear free in a model) provides a 
basis for social as well as purely logical binding.

I also note that Sergey's suggestion distinguishes between 'rdf:type' and 
'damlO:restrictedBy' (which I would also have done automatically until very 
recently).  I'll pursue this theme in (2) below in comments about types.


2. a wider comment on some of the logical (and ontological?) issues that 
they touch
-----------------------------------------------------------------------------------
I've had John Sowa's book on KR [1] sitting on my bookshelves unread for 
too long, and have just started to actually read it.  And, for me, it is 
veritably teeming with new perspectives and ideas.  There are a couple in 
particular, which are probably obvious or old hat to KR veterans, that I'd 
like to draw out because I found them useful.

I have tended to view "type" in programming language terms as a "set of 
values" (and an associated algebra).  Sowa points out that there are 
intensional (intrinsic meaning or associated concept) and extensional 
(denotation, collection of values) views of a type, and that the two are 
not always interchangeable.

A "type" can alternatively be viewed as a monadic predicate, which is true 
of any instance of that type.  This view also allows any instance to be of 
more than one type (an idea I have been slow to appreciate).  This 
intensional approach to types can create distinctions and represent 
relationships that are not present in the extensional, set-oriented 
approach.  Adopting this view, I think that Sergey's suggested use of 
'damlO:restrictedBy' can be seen as simply another way to define (or 
refine) a type.
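
To make the distinction concrete, here is a minimal Python sketch (my own illustration, not taken from Sowa).  The property names and individuals are invented for the example.  Two types are written as one-place predicates;  over a small universe they have the same extension, yet they remain different types, and one individual satisfies both.

```python
# Descriptions of individuals; the property names are invented for the sketch.
universe = {
    "MyCar": {"model": "FordEscort", "colour": "Red"},
    "YourCar": {"model": "FordEscort", "colour": "Red"},
}


# Intensional view: a type is a monadic predicate over descriptions.
def is_ford_escort(desc):
    return desc.get("model") == "FordEscort"


def is_red_thing(desc):
    return desc.get("colour") == "Red"


# Extensional view: a type is the set of individuals the predicate accepts.
def extension(predicate, individuals):
    return {name for name, desc in individuals.items() if predicate(desc)}


# Over this universe the two extensions happen to coincide.
same_before = extension(is_ford_escort, universe) == extension(is_red_thing, universe)

# One new individual is enough to separate them again.
universe["PostBox"] = {"colour": "Red"}
same_after = extension(is_ford_escort, universe) == extension(is_red_thing, universe)

print(same_before, same_after)
```

A restriction such as 'damlO:restrictedBy' would, on this reading, just be one more predicate that an instance may or may not satisfy.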

The other thing that struck me from Sowa's presentation was the duality 
between predicate logic and conceptual graphs (CG), and in particular the 
correspondence between bound (quantified) variables in predicate logic and 
any-to-any concept relationships in a CG.  This parallels something I've 
noticed about RDF:  resource identifiers are used to create the arbitrary 
conceptual linkages that RDF can express.  But resource identifiers are 
_also_ used to identify global entities, which (I understand) are expected 
to be globally bound.  This leads me to think that the use of resource 
identifiers in RDF may be overloaded -- in saying this, I think I'm echoing 
the points that Pat Hayes has raised about "proper names" [3].


3. an attempt to identify the different but overlapping goals of my work 
and DAML-O
-----------------------------------------------------------------------------------
My understanding is that DAML-O is attempting to create a framework in 
which "web-proofs" about things in the web, and things described by the 
web, can be verified.  Thus, the emphasis is on logical inference and 
verification capabilities.

My work here is motivated by a desire to express complex relationships in 
RDF in a way that can be used to support certain inference patterns.  The 
emphasis here is on expressivity.  And in particular, I want to be able to 
describe things without necessarily knowing the full ontological structures 
involved.  So I want to say things about the engine and body of my car, 
without actually specifying exactly how they are related to my car.  (Maybe 
I'm unwittingly trying to adopt a mereological approach to ontology here, 
rather than a set-theoretic one?  Cf. Sowa's book [1], section 2.6.)

I see no reason to think that these goals are mutually exclusive.  But the 
problems are not easy, and I think that trying to solve too many (both) at 
once is too much for me to handle.  Smarter minds than mine are focusing on 
the logical side.

Thus, I plan to pursue the ideas I started in my paper on contexts [2], 
taking on board some of the ideas that have come out more recently, 
maintaining my focus on expressivity rather than standardized 
inference.  In particular, I want to be able to construct models 
incrementally, using whatever information may be to hand;  this means they 
must be expressible based on partial information, and not need 
restructuring to accommodate new information as it becomes available.  Later, 
I hope to understand how this relates to the logical/ontological work of 
DAML, and revise the ideas accordingly.

Ideally, my work will be fully expressible in (logically equivalent to or 
subsumed by) the analytical framework of DAML, while offering a different 
approach that allows incremental expression (one doesn't need to know the 
full ontological structure to make useful statements).


4. an outline of how I propose to progress my ideas
---------------------------------------------------
With help from various folks at Bristol, I have already restructured my 
contexts ideas [2] to somewhat separate model, schema and logical 
elements.  The core construct is a container of reified statements.  Other 
parts build on that.  But I stalled when I tried to put examples of the 
kind of "prototyping" I envisage into the document, and the restructuring 
is only part done.
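
For what it's worth, the kind of thing I have in mind can be sketched in a few lines of Python.  This is only an illustration under my own assumptions:  the class and method names are invented, 'asserts' and 'bindTo' are the properties from my earlier message (quoted below), and there is no guard against contexts that assert each other in a cycle.

```python
class Context:
    """A container of reified statements (hypothetical sketch)."""

    def __init__(self, name):
        self.name = name
        self.statements = set()  # reified (subject, property, object) triples
        self.imports = []        # contexts asserted as a whole by this one
        self.bindings = {}       # local name -> resource, from 'bindTo'

    def assert_statement(self, subject, prop, obj):
        self.statements.add((subject, prop, obj))

    def assert_context(self, other):
        # 'asserts' with a context as object: its statements hold here too.
        self.imports.append(other)

    def bind_to(self, name, resource):
        # 'bindTo': treat the two names as equivalent within this context.
        self.bindings[name] = resource

    def resolve(self, term):
        return self.bindings.get(term, term)

    def all_statements(self):
        """Own and imported statements, with this context's bindings applied."""
        collected = set(self.statements)
        for ctx in self.imports:
            collected |= ctx.all_statements()
        return {(self.resolve(s), p, self.resolve(o)) for s, p, o in collected}


# The prototype context, using the unbound names Body and Engine.
ford_escort = Context("FordEscort")
ford_escort.assert_statement("Body", "style", "Hatchback")
ford_escort.assert_statement("Engine", "fuelType", "Petrol")

# The instance context binds those names and adds what it knows.
my_car = Context("MyCar")
my_car.assert_context(ford_escort)
my_car.bind_to("Body", "TheBody")
my_car.bind_to("Engine", "TheEngine")
my_car.assert_statement("TheBody", "colour", "Red")
my_car.assert_statement("TheEngine", "capacity", "1600")

for statement in sorted(my_car.all_statements()):
    print(statement)
```

Note that the prototype context is left untouched;  the bindings apply only when its statements are viewed from within the MyCar context.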

I think a "proper name" construct can be introduced without extending or 
bending RDF as something like this:

    [AnyResource]--properName-->[(anon?)] --rdf:type--> [ProperName]
                                [       ] --value-----> "(name string)"

This isn't especially pretty, but I think it can be made to work within 
current RDF, and is simple enough for a front-end to hide the 
messiness.  If some other mechanism is introduced that works for proper 
names (e.g. DanB's thoughts on anonymous resources) that can be substituted 
later.
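
As a sketch of how a front-end might hide that messiness, assuming triples are held as plain Python tuples:  the helper names, the "_:pn" labels for anonymous nodes and the example URI are all made up for illustration, and 'properName', 'ProperName' and 'value' are just the names used in the diagram above.

```python
from itertools import count

_anon_ids = count(1)  # source of labels for anonymous nodes


def add_proper_name(triples, resource, name):
    """Attach a proper name to a resource via an anonymous node."""
    node = f"_:pn{next(_anon_ids)}"
    triples.add((resource, "properName", node))
    triples.add((node, "rdf:type", "ProperName"))
    triples.add((node, "value", name))
    return node


def proper_names(triples, resource):
    """Return the name strings attached to a resource."""
    nodes = {o for s, p, o in triples if s == resource and p == "properName"}
    return {o for s, p, o in triples if s in nodes and p == "value"}


graph = set()
add_proper_name(graph, "http://example.org/cars#TheBody", "Body")
print(proper_names(graph, "http://example.org/cars#TheBody"))
```

The caller only ever sees the resource and the name string;  the anonymous node stays an internal detail that could be swapped for another mechanism.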

I'd also add a way for statements to indicate scoped equivalences between 
different proper name instances.

Unless someone points out a major oversight in this, I'll turn my 
attentions back to the paper and try to pull it together into something 
we can prototype.


Thanks,

#g
--

[1] John F. Sowa, "Knowledge Representation: Logical, Philosophical, and 
Computational Foundations", Brooks/Cole 2000, ISBN 0-534-94965-7.

[2] (My discussion paper on contexts for information modelling in RDF - 
currently with many loose ends.) 
http://public.research.mimesweeper.com/RDF/RDFContexts.html.

[3] Comments about "proper names" (was "public names") by Pat Hayes:
<http://lists.w3.org/Archives/Public/www-rdf-logic/2000Oct/0112.html>, 
<http://lists.w3.org/Archives/Public/www-rdf-logic/2000Oct/0122.html>.



At 06:23 PM 11/2/00 -0800, Sergey Melnik wrote:
>Sorry for crossposting, I believe this topic is directly related to
>DAML-O.
>
>Graham Klyne wrote:
> >
> > Folks,
> >
>[...]
> > I'm still having problems with finding a sufficiently flexible
> > mechanism to bind "prototype" statements into an instance of some
> > class.  The solutions I've seen posted so far assume that the statements
> > apply directly to an instance of the class in which they are defined.  I
> > think this is too restrictive for defining complex relationships and
> > prototyping structures;  e.g. Tom (if I understand correctly) has proposed
> > something like:
> >
> >     [FordEscort] --rdf:type---> [rdfs:Class]
> >     [          ] --bodyStyle--> "HatchBack"
> >     [          ] --fuelType---> "Petrol"
> >
> > and
> >
> >     [MyCar] --rdf:type--------> [FordEscort]
> >     [     ] --bodyColour------> "Red"
> >     [     ] --engineCapacity--> "1600"
> >
> > to create a description meaning something like:
> >
> >     [MyCar] --bodyStyle-------> "HatchBack"
> >     [     ] --fuelType--------> "Petrol"
> >     [     ] --bodyColour------> "Red"
> >     [     ] --engineCapacity--> "1600"
> >
> > This simple approach works only when the properties defined in the
> > [FordEscort] prototype are applied *directly* to an instance of that
> > prototype.  In some of the modelling work we have tried to do, this kind of
> > direct linkage of all properties to an instance of a type is too
> > constraining to be practically useful.  This is one of my motivations for
> > trying to use contexts, so that I can create a high-level description of
> > MyCar and subsequently refine the parts:
> >
> >     [MyCar] --rdf:type--> [FordEscort]
> >     [     ] --asserts--->
> >       {
> >       [TheBody] ----colour----> "Red"
> >       [TheEngine] --capacity--> "1600"
> >        :
> >       (etc.)
> >       }
> >
> > (Using here the notation "[<context>] --asserts--> {<StatementSet>}" to
> > capture the idea of a collection of reifications of statements that are
> > asserted to be true in the subject context.)
> >
> > This structure allows us to make statements about entities that may be
> > directly or indirectly related to [MyCar], without necessarily having to
> > know up-front the nature of that relationship.  It allows us to make
> > statements about what we know, without having to make up arbitrary (and
> > probably flawed) statements about what we don't know.  It is my conviction
> > that enabling such an incremental approach to information modelling will be
> > extremely powerful for dealing with complex relationships.
> >
> > So, what's missing?  It comes back to defining ways to relate prototype
> > statements to a specific instance of a class.  Brian has made an
> > imaginative proposal, based on a "logic" of type matching between the
> > prototype and the instance context.  I feel this is still too restrictive.
> >
> > I am thinking about the discussion of "anonymous" resources, and also
> > comments made by Pat Hayes on the RDF-logic list
> > <http://lists.w3.org/Archives/Public/www-rdf-logic/2000Oct/0112.html>,
> > <http://lists.w3.org/Archives/Public/www-rdf-logic/2000Oct/0122.html>.  I
> > think Pat has it right:  that what is required is a way to introduce
> > "proper names" that have a common (but unspecified) referent within some
> > range of use (not lexically defined), and which may be bound to different
> > globally unique names (URIs) in different contexts.
> >
> > This would lead to my example becoming something like this:
> >
> >     [FordEscort] --rdf:type---> [rdfs:Class]
> >     [          ] --rdf:type---> [Context]
> >     [          ] --asserts---->
> >       {
> >       [Body] ----style-----> "Hatchback"
> >        :
> >       [Engine] --fuelType--> "Petrol"
> >        :
> >       (etc.)
> >       }
> >
> >     [MyCar] --rdf:type--> [FordEscort]
> >     [     ] --asserts---> [FordEscort]
> >     [     ] --asserts--->
> >       {
> >       [Body] -------bindTo----> [TheBody]
> >       [Engine] -----bindTo----> [TheEngine]
> >        :
> >       [TheBody] ----colour----> "Red"
> >       [TheEngine] --capacity--> "1600"
> >        :
> >       (etc.)
> >       }
> >
> > (The presentation is clumsy;  at this time I am merely trying to illustrate
> > the idea of binding proper names used in one context to specific resources
> > described in another.)
> >
> > Here, the property 'asserts' with a context as its object is used to state
> > that all statements asserted in the object context are also asserted in the
> > subject context.  The 'bindTo' property is used to assert an equivalence
> > between two names (within the context containing that assertion).
> >
> > All this presupposes the introduction of a resource identifier form along
> > the lines of "proper names", which may be regarded as a departure from
> > conventional RDF/WEB thinking.
> >
> > #g
> > ------------
> > Graham Klyne
> > (GK@ACM.ORG)
>
>Graham,
>
>I completely agree with you that having a schema-based mechanism for
>specifying default values for class instances (similar to static
>variables in languages like C++ and Java) is extremely useful. In fact,
>DAML-O has a limited facility for expressing that.
>
>Your first example could be addressed in DAML-O by defining a
>hasValue-Restriction on property bodyStyle for class FordEscort (this
>roughly corresponds to declaring a static final variable for a Java
>class). However, in my understanding, the more complex examples that you
>mentioned can be tackled in DAML-O only in a relatively clumsy way.
>You'd have to define classes like FordEscort_Body and FordEscort_Engine,
>require that every FordEscort has exactly one FordEscort_Body and
>FordEscort_Engine, and attach corresponding hasValue-Restrictions to
>FordEscort_Body and FordEscort_Engine.
>
>Moreover, there are a lot of situations where it gets even clumsier and
>specialized subproperties are required to express schema constraints.
>For example, imagine we want to make sure that every car has exactly
>four wheels and that every wheel has a wheelID that identifies the
>location of the wheel (e.g. "FL", "FR", "RL", "RR"). Furthermore, a car
>must have at most one front left, rear left etc. wheel.
>
>As far as I can see, the only way to express it in DAML-O would be to
>create 4 subclasses of Wheel (WheelFL, WheelFR, WheelRL, WheelRR) with
>corresponding hasValue-Restrictions and 4 subproperties of hasWheel
>(hasWheelFL, hasWheelFR, hasWheelRL, hasWheelRR). Obviously, this
>approach is far from elegant. (Is there another one?)
>
>Another option would be defining rules in a logical constraint language
>(DAML-L?) to handle cases like that. The wheel example could be
>specified as
>
>[Car] --rdf:type--> [rdfs:Class]
>[Car] --damlO:restrictedBy --> [R001]
>[R001] --rdf:type--> [damlL:Expression]
>
>where R001 would represent a conjunction of primitive existential
>expressions like
>
>X  rdf:type Car
>X  hasWheel W1
>X  hasWheel W2
>X  hasWheel W3
>X  hasWheel W4
>W1 wheelID "FL"
>W2 wheelID "FR"
>W3 wheelID "RL"
>W4 wheelID "RR"
>
>Combined with cardinality restrictions on wheelID [1, 1] and hasWheel [4,
>4], the above constraint would help to avoid creating 8 unnecessary
>language elements (4 classes and 4 properties). Your second example
>
> >     [FordEscort] --rdf:type---> [rdfs:Class]
> >     [          ] --rdf:type---> [Context]
> >     [          ] --asserts---->
> >       {
> >       [Body] ----style-----> "Hatchback"
> >        :
> >       [Engine] --fuelType--> "Petrol"
> >        :
> >       (etc.)
> >       }
>
>could be expressed similarly as
>
>[FordEscort] --rdf:type--> [rdfs:Class]
>[FordEscort] --damlO:restrictedBy --> [R002]
>[R002]       --rdf:type--> [damlL:Expression]
>
>with R002:
>
>X rdf:type FordEscort
>B rdf:type Body
>B style    "Hatchback"
>E rdf:type Engine
>E fuelType "Petrol"
>
>(As you suggested, the relationship between the car, its body and its
>engine is not made explicit).
>
>
>The eternal problem with integrity constraints is complexity. It
>should be trivial *both* to check the consistency of models with respect
>to a given set of schemas *and* to support graphical editing tools that
>could, for example, automatically instantiate objects whose existence is
>implied by the schema.
>
>It looks to me that cardinality and conjunctions of existential
>statements are both relatively expressive and easy to deal with.
>
>Would it make sense to have this capability in DAML-O? Should this go
>into DAML-L?
>
>What should implementors use right now in their ontologies?
>
>Sergey

------------
Graham Klyne
(GK@ACM.ORG)
Received on Friday, 3 November 2000 12:21:06 UTC