Re: AS & S review: RDFS-compatible OWL semantics from herman.ter.horst@philips.com on 2003-01-20 (www-webont-wg@w3.org from January 2003)

From: <herman.ter.horst@philips.com>
Date: Mon, 20 Jan 2003 15:32:22 +0100
To: "Peter F. Patel-Schneider" <pfps@research.bell-labs.com>
Cc: www-webont-wg@w3.org
Message-ID: <OFFCF723B8.653B0453-ONC1256CB4.00469F76-C1256CB4.00500B80@diamond.philips.com>
> From: herman.ter.horst@philips.com
> Subject: AS & S review: RDFS-compatible OWL semantics
> Date: Fri, 17 Jan 2003 17:13:06 +0100
> 
> > Here is the next part of my review comments on the Semantics 
> > document:
> > Section 5 RDFS-Compatible Model-Theoretic Semantics
> > Version of 16 January

[...]

> > 
> > The first point that I want to raise is that many 
> > small, additional assumptions need to be made on an 
> > OWL interpretation in order to assure that each use that
> > is made of the functions EXTI and CEXTI can really be made.
> > In order to explain this, note that the current text does not 
> > start from a suitable summary of the RDF Semantics. 
> > In the second and third paragraph of Section 5.2,
> > an RDFS interpretation is first described as a three-tuple,
> > and then, in a "datatyped" version, as a four-tuple.
> > However, an RDFS interpretation, datatyped or not, is currently
> > described as a five-tuple, including a set of properties PI as
> > part of the basic definition.  The domain of the function EXTI
> > is not all of RI, but only the set PI.  This change was made in 
> > November (when the RDF Model Theory was renamed to RDF Semantics;
> > the OWL AS & S document still speaks of RDF MT).
> 
> I've made the appropriate change, to something very close to what you
> say below.
> 
> > A related point is that the domain of the function CEXTI consists
> > only of the set of classes CI, which may also not be all of RI 
> > (this was already the case with the RDF MT version of April).
> 
> This is not correct.  ICEXT is still defined for all resources, even for
> literals.  This may be a bug in the RDFS semantics.

ICEXT is defined for literals/datatypes.
This is an explicit policy of the RDF Semantics: see the definition
of D-interpretation, in particular the fourth condition in the table 
and its explanation in the text below.

But ICEXT is not necessarily defined for all resources.
In November, I pointed to a seeming circularity in the definitions
in the RDF MT, between the notions IC and ICEXT [1].
Subsequently, the text was slightly expanded and does not leave
any room for doubt: IC is defined first, and ICEXT is
defined in terms of IC. 
I quote from the most recent version of the RDF Semantics at [2]:

"Although not strictly necessary, it is convenient to state the 
RDFS semantics in terms of a new semantic construct, a 'class', 
i.e. a resource which represents a set of things in the universe which 
all have that class as the value of their rdf:type property. 
Classes are defined to be things of type rdfs:Class. 
We will assume that there is a mapping ICEXT (for the Class Extension 
in I) from classes to their extensions; the first semantic condition 
in the table below amounts to the following definition of this mapping 
in terms of the relational extension of rdf:type:
  ICEXT(x) = {y | <y,x> is in IEXT(I(rdf:type)) }  "
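
To make this reading concrete, here is a minimal sketch in Python. It is
not taken from the RDF Semantics document: the names RDF_TYPE, RDFS_CLASS,
iext, class_set and icext, and the toy resources ex:Person and ex:alice,
are assumptions made only for illustration. It models IEXT(I(rdf:type))
as a set of pairs, defines IC first, and lets ICEXT be defined only on IC.

```python
# Toy model; every name below is an illustrative assumption.
RDF_TYPE = "rdf:type"
RDFS_CLASS = "rdfs:Class"

# IEXT, restricted to rdf:type: a pair (y, x) means "y has type x".
iext = {
    RDF_TYPE: {
        ("rdfs:Class", "rdfs:Class"),
        ("ex:Person", "rdfs:Class"),
        ("ex:alice", "ex:Person"),
    }
}

def class_set(iext):
    """IC: the resources whose rdf:type is rdfs:Class."""
    return {y for (y, x) in iext[RDF_TYPE] if x == RDFS_CLASS}

def icext(iext, c):
    """ICEXT(c) = {y | <y,c> in IEXT(I(rdf:type))}, for c in IC only."""
    if c not in class_set(iext):
        raise KeyError(f"{c!r} is not in IC, so ICEXT is undefined for it")
    return {y for (y, x) in iext[RDF_TYPE] if x == c}
```

In this toy model icext(iext, "ex:alice") raises an error, because
ex:alice is not a class, while icext(iext, RDFS_CLASS) returns IC itself.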



> 
> > In view of this, the given summary of RDF semantics should be 
> > replaced by an up to date and somewhat more extensive summary.
> > Let me briefly summarize the basic definition of the RDF semantics,
> > in order to be able to describe the additional assumptions that
> > need to be made in the OWL semantics, and in order to facilitate 
> > the replacement of the given summary of the RDF semantics. 
> > I use the slight adaptation made by Peter of the original notation 
> > of Pat, however without making many final I's a subscript, of 
> > course.
> > An RDFS interpretation of a vocabulary V is a five-tuple consisting 
> > of:
> > - a set RI (the universe)
> > - a set PI subsetOf RI
> > - a function EXTI : PI -> P(RI x RI) 
> > - a function SI : V -> RI
> > - a function LI : {typed literals} -> RI
> > satisfying many special conditions specified in the RDF Semantics.
> > (By the way, referring to an earlier part of this review,
> > note that the P(X) notation for power set is very convenient here.)
> > 
> > Given such an RDFS interpretation, the set of classes is defined
> > to be
> > CI := {x in RI: <x,SI(rdfs:Class)> in EXTI(SI(rdf:type))}.
> > This set is defined to be the domain of the function
> > CEXTI : CI -> P(RI) 
> > which function is defined by:
> > CEXTI(c) := {x in RI : <x,c> in EXTI(SI(rdf:type))} (c in CI)
> 
> This is not in the RDF Semantics document.

It is: see my quote above.

> 
> > These are all the definitions that need to be summarized.
> > It follows from the complete definition of RDFS interpretation
> > (actually, it follows already from the definition of RDF 
> > interpretation) that
> > (*) CI = CEXTI(SI(rdfs:Class))  and  PI = CEXTI(SI(rdf:Property)).
> > (The range that I give above to the function CEXTI does not
> > appear explicitly in the RDF Semantics document, but follows
> > clearly from what is said there.)
> 
> range -> domain ?

I really mean range here: For a c in CI, the value CEXTI(c)
recalled above is a subset of RI, so the range of CEXTI
can be taken to be P(RI).

> 
> > So each table in Section 5.2 needs to be expanded with an 
> > assumption
> > SI(E) in CI (in case CEXTI(SI(E)) is used) or 
> > SI(E) in PI (in case EXTI(SI(E)) is used).
> > 
> > In the second table of Section 5.2 this is easy: each of 
> > the empty cells in the second column can just be assigned 
> > the content CI.
> 
> Not needed.

It is really needed.  It is central to the purpose of this 
document to define, with mathematical precision, the semantics
of OWL in terms of that of RDFS.  If it is not ensured that the
mathematical functions used are only used inside their domains, 
then mathematical rigor is lost.

> 
> > In the later tables it is also possible to incorporate the 
> > required additional assumptions, in the bold header texts.
> 
> Not needed.

See previous comment.

> 
> > A simpler and more elegant way to incorporate these additional 
> > assumptions could be as follows.
> > The OWL vocabulary at the beginning of Section 5.1.1 
> > (where it now "appears" with an ellipsis) could be expanded
> > explicitly, using two disjoint subsets VOWLC and VOWLP (it is 
> > clear which vocabulary members should go where).  Then the 
> > required additional assumptions on an OWL interpretation
> > can be made in one stroke with 
> > SI(VOWLC) subsetOf CI   and SI(VOWLP) subsetOf PI.
> 
> I think that this is neither necessary nor desirable.

It is a way to simplify the needed introduction of the
additional assumptions.
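
As a rough illustration of this "one stroke" condition, the following
Python sketch checks SI(VOWLC) subsetOf CI and SI(VOWLP) subsetOf PI on a
toy interpretation. The function names, the resource names r_thing and
r_same, and the assignment of owl:Thing and owl:sameClassAs to the two
subsets are assumptions chosen only for the example; the actual contents
of VOWLC and VOWLP would have to be fixed in the document.

```python
def image(si, names):
    """SI applied to a set of vocabulary names."""
    return {si[name] for name in names}

def satisfies_vocabulary_assumptions(si, vowlc, vowlp, ci, pi):
    """True iff SI(VOWLC) is a subset of CI and SI(VOWLP) is a subset of PI."""
    return image(si, vowlc) <= ci and image(si, vowlp) <= pi

# Hypothetical toy data, not the real OWL vocabulary split.
si = {"owl:Thing": "r_thing", "owl:sameClassAs": "r_same"}
vowlc = {"owl:Thing"}
vowlp = {"owl:sameClassAs"}
ci = {"r_thing"}
pi = {"r_same"}
```

With these assumptions in place, every use of CEXTI(SI(E)) for E in VOWLC
and of EXTI(SI(E)) for E in VOWLP stays inside the domain of the function.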

> 
> > The equations (*) above can be used to simplify many entries
> > in the tables in Section 5.2, by taking CI or PI instead of 
> > the expansions in the right-hand sides of these equations.
> > It should be noted that CI and PI are more fundamental in the
> > RDF Semantics than these expansions.
> 
> CI actually is subordinate to ICEXT(I(rdfs:Class)).
> Given that CI is subordinate, I prefer to keep consistency by using
> ICEXT throughout.

As I explained above, ICEXT is defined in terms of CI,
so CI is more fundamental than ICEXT and the
expression CEXTI(I(rdfs:Class)).

Using CI instead of CEXTI(SI(rdfs:Class)) and
PI instead of CEXTI(SI(rdf:Property)) gives a nice
simplification of the tables.

> 
> > Also, the conditions for an OWL interpretation to be OWL Full
> > become simply IOT = RI, IOC = CI, IOP = PI.

This slightly simplifies the definition of OWL Full 
interpretations.

[...]

> 
> > In my view, the formal definition of an OWL interpretation should
> > include, in addition to an RDFS interpretation <RI,PI,EXTI,SI,LI>,
> > the distinguished subsets IOC, IOP, IOT, IOR, IOOP, IODP, IDC,
> > IAD, and IL of RI.  Otherwise, these sets "fall out of the air".
> > Each of these 9 subset relationships is implied by the second
> > table of Section 5.2, except for IAD subsetOf RI, which should
> > be added to this table.
> 
> I think that this is neither needed nor desirable.

You added a new sentence after the table, in the version of 17 
January, and this solves my problem here:

"The above table is the definition of several semantic sets, namely 
IOC, IOT, IOR, IOP, IOOP, IODP, IDC, IAD, and IL. That is, these 
are simply shorthand names for the appropriate class extension."

Now it is clear that these sets do not fall out of the air: they are
defined here.

[...]

> > The first table in Section 5.2 uses sets IOP, IOC etc. whose 
> > meaning is not yet clear.  Therefore I propose to move this table
> > to the third position.  Then, moreover, we get three more coherent 
> > "groups of tables" in a row:
> > 1. universe/syntactic categories; classes/datatypes/properties
> > 2. the "iff tables": domains/ranges; equivalence
> > 3. the "DL tables": Boolean combinations; restrictions; 
> >    comprehension principles
> 
> I don't think that this reordering is helpful.

In view of your addition just cited, it is now even necessary
to move the first table.  The sets IOP, IOC, IDC appearing in 
the first table (the iff conditions for domains and ranges) 
are only defined in the next table (on universe and syntactic 
categories).
I propose to move the table on domains and ranges just before
the table on equivalences, since both work with iff conditions.

> 
> > I feel that Section 5.2 could use more text to motivate these
> > different kinds of tables.  For example, can it be 'explained' 
> > why is there an iff for owl:sameClassAs and owl:disjointWith
> > but not for owl:complementOf?
> 
> There is very little motivation now.  However, I think that a useful
> amount of motivation would turn out to be a large amount of motivation.
> I've been too busy with RDF to attend to it.
> 
> 
> > In my view, the first condition on oneOf is an unsuitable integration
> > of dissimilar conditions.  In fact, the next table, which is
> > completely devoted to oneOf, could be omitted by slightly 
> > extending the condition in the previous table, as follows:
> >   ( x in IOC and l is a sequence of y1,...,yn over IOT
> >   or x in CI and l is a sequence of y1,...,yn over LV )
> >   and CEXTI(x) = {y1,...,yn}
> 
> The condition for oneOf is similar to the ones for unionOf and
> intersectionOf.  I thus think that it should stay.  The change you
> suggest would cause problems for OWL DL.
> 

[...]

> 
> > Section 5.1 starts with the following sentence:
> > "All of the OWL vocabulary is defined on the 'OWL universe', 
> > which is a collection of RDFS classes that are intended to 
> > circumscribe the domain of application of the OWL vocabulary: 
> > owl:Thing, owl:Class and owl:Property."
> > I read here that 
> > OWL universe = {owl:Thing, owl:Class and owl:Property}.
> > However, with the RDF semantics it is inherited that
> > the set RI is called the universe of the interpretation,
> > as is also mentioned in the beginning of Section 5.2.
> > As the word universe is used here in two different ways,
> > I feel that the wording of the cited sentence should be
> > adapted to incorporate the connection with RDF semantics.
> 
> Changed to
> 
> <p>All of the OWL vocabulary is defined on the 'OWL universe', which is a
>    division of part of the RDFS universe into three parts, namely the class
>    extensions of owl:Thing, owl:Class and owl:Property.
>   The  class extension of owl:Thing comprises the individuals of the
>   OWL universe.  The  class extension of owl:Class comprises the
>   classes of the OWL universe.  The  class extension of
>   owl:Property comprises the properties of the OWL universe.
> </p>

OK.

> 
> > 
> > The table on the semantics of the cardinality restrictions 
> > does not yet include the corrections which I believe you 
> > confirmed earlier.
> > It should be, three times:
> >    card{v in IOT union LV : <u,v> in EXTI(p)}
> > (In our earlier discussion I missed the LV part.
> > In this way, both object properties and datatype properties
> > are covered, in the correct, intended way.)
> > Without this addition, formally, there is no set, so
> > no cardinality can be taken.  Instead, formally, there is 
> > only a class, not in the sense of OO or RDF or OWL, but 
> > in the sense of Zermelo-Fraenkel set theory.
> 
> I disagree.  I believe that the definitions are fine as they are.

I am surprised that you retract the confirmation that you
earlier made in [3].
Let me ask you a technical question:
How do you know, in terms of axioms and/or theorems from set 
theory, that {v: <u,v> in EXTI(p)} is a set?
The fact that {v in IOT union LV : <u,v> in EXTI(p)} is a set
follows directly from one of the first axioms of set theory, the
specification axiom: a set is given, and a predicate for
forming a subset.
Apart from this set-theoretic argument, I am in favor of
making completely explicit, in this definition of the
semantics of minCardinality, maxCardinality, and cardinality,
which elements should be counted.
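
The bounded form can be sketched in Python as follows. This is only an
illustration under assumed names (counted_values,
satisfies_min_cardinality, and the toy data are not from the AS & S
document): the comprehension ranges over the given set IOT union LV, so
the result is a subset of a known set and its cardinality can be taken.

```python
def counted_values(ext_p, u, iot, lv):
    """{v in IOT union LV : <u,v> in EXTI(p)}, a subset of a given set."""
    return {v for v in (iot | lv) if (u, v) in ext_p}

def satisfies_min_cardinality(ext_p, u, iot, lv, n):
    """True iff at least n elements of IOT union LV are p-values of u."""
    return len(counted_values(ext_p, u, iot, lv)) >= n

# Hypothetical toy data: one property extension, individuals, data values.
ext_p = {("ex:alice", "ex:bob"), ("ex:alice", "lit:42"), ("ex:carol", "ex:bob")}
iot = {"ex:alice", "ex:bob", "ex:carol"}
lv = {"lit:42", "lit:7"}
```

The same counted_values set would serve for maxCardinality and
cardinality, with the comparison >= replaced by <= and == respectively.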

> 
> > I find it confusing, in the definition of separated OWL
> > vocabulary in Section 5.3.2, to identify a vocabulary
> > with a partition of it.  I am in favor of omitting the 
> > = sign, and of speaking of a vocabulary V' with partition
> > <...>.  This would also affect (improve) the next paragraphs, 
> > including the statement of Theorem 1.
> 
> I should have used V = VI + VC + VD + VOP + VDP

I believe you should add that + stands for disjoint union, since
this is non-standard notation.
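
For concreteness, the intended reading of + can be sketched in Python.
The function name is_disjoint_union and the toy vocabulary names are
assumptions for illustration only: V is the disjoint union of the parts
exactly when the parts are pairwise disjoint and together make up V.

```python
from itertools import combinations

def is_disjoint_union(v, parts):
    """True iff the parts are pairwise disjoint and their union equals v."""
    pairwise_disjoint = all(a.isdisjoint(b) for a, b in combinations(parts, 2))
    return pairwise_disjoint and set().union(*parts) == v

# Hypothetical toy vocabulary split as V = VI + VC + VD + VOP + VDP.
vi, vc, vd = {"ex:alice"}, {"ex:Person"}, {"xsd:integer"}
vop, vdp = {"ex:knows"}, {"ex:age"}
v = vi | vc | vd | vop | vdp
```

Here is_disjoint_union(v, [vi, vc, vd, vop, vdp]) holds, and it fails as
soon as one name is placed in two of the parts.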

[...]

> 
> > In Sections 5.3.1/2, three RDF Graphs should become RDF graphs.
> 
> Technical terms can (and often should) be capitalized, I believe.
> 
You should be consistent in your choice: most often there is a lowercase g,
and
only three times G.  More importantly, technical terms specifying 
an individual item should indeed be capitalized, but generic technical 
terms such as RDF graph or abstract OWL ontology (recall the earlier 
part of my review) should not be capitalized. 
As to RDF graphs, moreover, let us be consistent with the 
RDF Semantics document, which speaks of RDF graphs.
[...]

> 
> > As to notation, I prefer the standard notation for empty set
> > instead of {} (this also appeared in earlier sections).
> 
> I prefer {}, as it uses the same notation as non-empty sets.

Since {} is non-standard, it would need explanation.  With
set theory as with RDF, I believe it is preferable to follow 
the standard.
> 
> peter
> 
> 

I need to make some further remarks about Section 5.

The definition of OWL interpretation does not yet have PI
added in the tuple I.

There is a problem with the use of the word D-interpretation
in the same definition.  The RDF Semantics speaks
of D-interpretations of RDF graphs,
not of vocabularies, so as to be able to refer to
the typed literals appearing in graphs.
So the AS & S document cannot speak of D-interpretations
of vocabularies.
I therefore believe that this definition should read as 
follows:
  An OWL interpretation I=<..> of a vocabulary V, where V
  includes ..., is an RDFS  interpretation of V that 
  satisfies all constraints in this section.
Before reacting, read also the next comment, where
D-interpretations return!

The definition of OWL DL interpretation of an RDF graph
(and analogously of OWL Full interpretation) 
should be slightly changed, to make it fit with the
RDF Semantics.  Proposal:
  Let K ... and V ... .  An OWL DL interpretation of K 
  is an OWL DL interpretation of V that is also a
  D-interpretation of K and that satisfies K.
Motivation: see the preceding comment, and note also that
the RDF Semantics speaks of satisfaction of graphs
rather than interpretation, when dealing with the
truth of the statements in the graph.  I agree, however,
to make one shortcut with the definition of OWL
interpretation of a graph, in the way just indicated.


Herman ter Horst
Philips Research

[1] http://lists.w3.org/Archives/Public/www-rdf-comments/2002OctDec/0096.html
[2] http://www.w3.org/2001/sw/RDFCore/TR/WD-rdf-mt-20030117/
[3] http://lists.w3.org/Archives/Public/www-webont-wg/2002Dec/0285.html
Received on Monday, 20 January 2003 09:34:26 UTC