RE: completeness from Michael Schneider on 2008-02-21 (public-owl-wg@w3.org from February 2008)

From: Michael Schneider <schneid@fzi.de>
Date: Thu, 21 Feb 2008 18:17:43 +0100
To: "Ulrike Sattler" <sattler@cs.man.ac.uk>
Cc: <public-owl-wg@w3.org>
Message-ID: <0EF30CAA69519C4CB91D01481AEA06A0750FD8@judith.fzi.de>
Hi Uli!

Since there is no final agreement at the moment on what will eventually become OWL-Prime, I suggest to compare the situation with a language which will most probably become a sublanguage of OWL-Prime (given that OWL-Prime will become an OWL-Full fragment): RDFS.

  (1) The vocabulary of RDFS is clearly defined.

  (2) There is a proper model-theoretic semantics for RDFS [10].

  (3) There is a set of triple rules [20], which is sound and complete w.r.t. to the set triples entailible by the model-theoretic semantics given in [10]. (There is a minor exception to this statement having to do with the fact that RDF(S)'s syntax does not allow bNodes in predicate position of an RDF triple, see section 1.5 in [30].)

  (4) Jena implements an RDFS reasoner [40] based on the triple rule set given in [20]. In fact, this implementation consists of three different "completeness levels" [50] (that's my term actually), where each of these implementations is incomplete w.r.t. the whole set of triple rules specified in [20]. The "level of incompleteness" is clear for each of the implementations, since it is defined in terms of a subset of triple rules used for the respective reasoner. (Ok, perhaps the used rules should have been explicitly itemized for each implementation, instead of being informally described by natural language text. But, luckily, Jena allows me to check myself which rules are actually used by just looking inside the rule files for the respective reasoner. :))

I would say that the incompleteness criterion of Jena's RDFS reasoners is fair: The "incompleteness level" of an implementation is simply defined in terms of a subset of rules of the full language (here: RDFS). Such a criterion allows at least two kinds of comparisons:

  * Intra-implementation comparison: The performance of two different completeness levels within the same reasoner suite can be compared. For example, the performance of the "Default" implementation of JenaRDFS can be compared to the "Simple" implementation, when the same data set is used.

  * Inter-implementation comparison: The performance of two different implementations providing the same completeness level can be compared (provided, of course, that the implementors publish the incompleteness level).

Currently, there is not yet any agreement on the vocabulary of OWL-Prime. This would be item (1) above. Let's say, the current proposal [60] is used in the end. Then, from what I can see from [30] (DANGER: I did not yet read it completely, so what I say here might be largely wrong!), it will be possible to have a subset of OWL-Full model-theoretic semantic conditions (item (2)), for which there is a sound and complete set of triple rules (item (3)). Let me cite from the abstract of [30]:

    "We define semantic extensions of RDFS that involve 
    [...] a subset of the OWL vocabulary that includes 
    the property-related vocabulary (e.g. FunctionalProperty), 
    the comparisons (e.g. sameAs and differentFrom) 
    [...] 
    For these semantic extensions we present entailment rules, 
    prove completeness results, [...]"

To be more concrete, here is the actual pD* vocabulary (RDFS vocabulary ommitted) from sec. 5.1 in [30]:

    owl:sameAs 
    owl:differentFrom 

    owl:equivalentClass
    owl:disjointWith

    owl:equivalentProperty
    owl:SymmetricProperty
    owl:TransitiveProperty
    owl:inverseOf
    owl:FunctionalProperty
    owl:InverseFunctionalProperty

    owl:Restriction
    owl:onProperty
    owl:someValuesFrom
    owl:allValuesFrom
    owl:hasValue 

As you can see, this vocabulary is a superset of the current OWL-Prime proposal. Actually, OWL-Prime does not support any form of owl:Restrictions, and it has not even differentFrom nor disjointWith included.

For the completeness results announced in the abstract above, in order to be able to estimate them I need to first have a deeper look into the chosen subset of OWL-Full model-theoretic semantics and the chosen set of triple rules. So I cannot tell any more on this topic at the moment (others might jump in here!). 

Anyway, it looks to me that, as long as OWL-Prime will be in the scope of pD* (both regarding vocabulary and model-theoretic semantics), there is a good chance to receive a reasonable language based on a set of triple rules. Then, implementors can decide to define subsets of it, and do both intra- and inter-implementation comparisons. In particular, they can do scalability comparisons.

Cheers,
Michael

[10] <http://www.w3.org/TR/rdf-mt/#rdfs_interp>
[20] <http://www.w3.org/TR/rdf-mt/#rules>
[30] <http://linkinghub.elsevier.com/retrieve/pii/S1570826805000144>
[40] <http://jena.sourceforge.net/inference/index.html#rdfs>
[50] <http://jena.sourceforge.net/inference/index.html#RDFSconfiguration>
[60] <http://lists.w3.org/Archives/Public/public-owl-wg/2008Jan/att-0308/00-part>

>-----Original Message-----
>From: public-owl-wg-request@w3.org 
>[mailto:public-owl-wg-request@w3.org] On Behalf Of Ulrike Sattler
>Sent: Wednesday, February 20, 2008 8:54 PM
>To: public-owl-wg@w3.org
>Subject: completeness
>
>
>Hi,
>
>I thought I'd share some of my thoughts regarding completeness,  
>scalability, & interoperability that might explain why I keep  
>shouting "what do you mean by *scalable*"?
>
>1) Even I can write a very scalable OWL DL query answering engine if  
>it doesn't have to be complete: when asked to retrieve instances of a  
>class C, it simply always only returns nothing...wait, I can even do  
>better by returning "told" instances of C!
>
>2) If we agree that (1) is sort of cheating, then we need to be more  
>precise what "completeness" means: now, if we say "my engine will  
>retrieve as many instances of C as it can manage in the given time",  
>then we might get more than the told instances, but we could be  in  
>trouble regarding interoperability: your engine could return a very  
>different answer set from mine, since they have different strengths  
>or optimisation techniques or,e.g., rule orderings.
>
>So, what I would like to see as a clarification of "we can trade a  
>bit of completeness for some scalability" is a description what  
>*kind* of completeness we give up for (ideally) how much gain in  
>performance.
>
>Cheers, Uli

--
Dipl.-Inform. Michael Schneider
FZI Forschungszentrum Informatik Karlsruhe
Abtl. Information Process Engineering (IPE)
Tel  : +49-721-9654-726
Fax  : +49-721-9654-727
Email: Michael.Schneider@fzi.de
Web  : http://www.fzi.de/ipe/eng/mitarbeiter.php?id=555

FZI Forschungszentrum Informatik an der Universität Karlsruhe
Haid-und-Neu-Str. 10-14, D-76131 Karlsruhe
Tel.: +49-721-9654-0, Fax: +49-721-9654-959
Stiftung des bürgerlichen Rechts
Az: 14-0563.1 Regierungspräsidium Karlsruhe
Vorstand: Rüdiger Dillmann, Michael Flor, Jivka Ovtcharova, Rudi Studer
Vorsitzender des Kuratoriums: Ministerialdirigent Günther Leßnerkraus
Received on Thursday, 21 February 2008 17:18:03 UTC