- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Tue, 22 Jul 2014 03:12:10 -0700
- To: "public-rdf-shapes@w3.org" <public-rdf-shapes@w3.org>
Using OWL for RDF constraint checking and closed-world recognition

OWL descriptions and the OWL semantics provide the necessary framework for both validating constraints and providing recognition facilities, and thus cover what ShEx is trying to do and more.

Why then are there claims that OWL is inadequate for these purposes? I do not know why, but there are several aspects of OWL that might not be consonant with constraints and the kind of recognition that might be desired. However, it turns out that both OWL syntax and semantics are arguably the right solution for RDF constraint checking and closed-world recognition.

Closed-World Recognition

Let's first look at recognition. Recognition is the basic operation in ShEx - we want to determine whether a particular node in an RDF graph matches a ShEx shape, for example, to say that John in

  <John> foaf:name "John"^^xsd:string .
  <John> foaf:phone "+19085551212"^^xsd:string .
  <John> ex:child <Bill> .
  <John> ex:child <William> .

matches

  { foaf:name xsd:string , foaf:phone xsd:string, ex:child [2] }

Recognition is also a basic operation of OWL. Determining whether an individual belongs to an OWL concept is recognition. The OWL version of the above ShEx expression (using a version of the DL publication syntax) is

  =1 foaf:name & all foaf:name xsd:string &
  =1 foaf:phone & all foaf:phone xsd:string &
  =2 ex:child

So it seems that OWL can easily handle ShEx recognition.

However, John does not match the above OWL description. Why is this? It is precisely that OWL does not assume that absence of information is information about absence. RDF works under the same assumptions, by the way. John could have more than one name as far as the above information is concerned. OWL (and RDF too) also does not assume that different names refer to different individuals. Bill and William could be the same person.

OWL has facilities to explicitly state information about absence and information about differences. If we add

  <John> in <=1 foaf:name .
  <John> in <=1 foaf:phone .
  <John> in all child {<Bill>, <William>} .
  <Bill> /= <William> .

then John does match the above description.

So it is not that OWL does not perform the kind of recognition that underlies ShEx, it is just that OWL does not make the assumption that absence of information is information about absence.

However, suppose that we want to make this assumption. This is roughly equivalent to saying that a system assumes that if it can't determine some fact, then that fact is false. There is a very large body of work on this topic because there are many tricky questions that arise with respect to closure in any sophisticated formalism, and OWL is indeed sophisticated.

Fortunately RDF and RDFS are not very sophisticated at all, and the tricky questions just do not arise if information comes in the form of RDF and RDFS triples. The basic idea is to treat the triples (and their RDF and RDFS consequences if desired) as completely describing the world. So, 1/ if a triple is not present then it is false and 2/ different IRIs describe different individuals.

This is precisely the same idea that underlies model checking. First-order inference is undecidable, but determining whether a first-order sentence is true in one particular state of the world is much, much easier.

So it is possible to use the OWL syntactic and semantic machinery to define how to recognize OWL descriptions under just the same assumptions that underlie ShEx. The only change from the standard OWL setup is to define how to go from an RDF graph to an OWL model. (There are some technical details that interfere with this general description of the account, but they are easy to handle.) Definitions, even recursive definitions, can be handled with only minor extensions to the framework.

This is all quite easy and conforms to a common thread of both theoretical and practical work.
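
To make the "graph as model" reading concrete, here is a minimal Python sketch of recognition under the two closure assumptions stated above (an absent triple is false, and different IRIs denote different individuals). It is an illustration only, not an OWL implementation: the names Literal, Restriction, values and recognizes are made up for this example, IRIs are written as plain strings, and only exact-cardinality and datatype conditions are covered.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """A typed literal; hypothetical helper for this sketch."""
    value: str
    datatype: str


class Restriction(NamedTuple):
    """'=n prop' plus an optional 'all prop datatype' condition."""
    prop: str
    exactly: int
    datatype: Optional[str] = None


# The RDF graph from the example, as (subject, predicate, object) triples.
GRAPH = {
    ("John", "foaf:name", Literal("John", "xsd:string")),
    ("John", "foaf:phone", Literal("+19085551212", "xsd:string")),
    ("John", "ex:child", "Bill"),
    ("John", "ex:child", "William"),
}

# =1 foaf:name & all foaf:name xsd:string &
# =1 foaf:phone & all foaf:phone xsd:string & =2 ex:child
DESCRIPTION = [
    Restriction("foaf:name", 1, "xsd:string"),
    Restriction("foaf:phone", 1, "xsd:string"),
    Restriction("ex:child", 2),
]


def values(graph, node, prop):
    """All objects of (node, prop, ?o). Under closure this set is complete,
    and distinct IRIs count as distinct individuals."""
    return {o for (s, p, o) in graph if s == node and p == prop}


def recognizes(graph, node, description):
    """True if node satisfies every restriction when the graph is the model."""
    for r in description:
        vals = values(graph, node, r.prop)
        if len(vals) != r.exactly:
            return False
        if r.datatype is not None and not all(
            isinstance(v, Literal) and v.datatype == r.datatype for v in vals
        ):
            return False
    return True


print(recognizes(GRAPH, "John", DESCRIPTION))  # True
```

No closure statements such as <John> in <=1 foaf:name or <Bill> /= <William> are needed here, because counting the triples that are present already encodes both assumptions.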
This approach also matches how StarDog ICV works (as the theoretical underpinning of StarDog ICV is one of these theoretical results). Further, the approach can be implemented by translation into SPARQL queries, showing that it is practical. (There may be some constructs of OWL that do not translate into SPARQL queries when working with complete information, but at least the parts of OWL that correspond to the usual recognition conditions do so translate.)

Constraint Validation

Constraint validation does not appear to be part of the services provided by OWL. This has led to claims that OWL cannot be used for constraint validation. However, inference, which is the core service provided by OWL, and constraint validation are indeed very closely related.

Inference is the process of determining what follows from what has been stated. Inference ranges from simple (students are people, John is a student, therefore John is a person) to the very complex. Inference can also recognize impossibilities (students are people, John is a student, John is not a person, therefore there is a contradiction).

In the presence of complete information, nothing new can be inferred, so inference only checks for impossibilities, i.e., constraint violations. So the way to do constraints in OWL is to first set up complete information, and then just perform inference.

For example, with the concept axiom

  ex:Person <= =1 foaf:name & all foaf:name xsd:string &
               =1 foaf:phone & all foaf:phone xsd:string &
               =2 ex:child

and in the presence of (locally) complete information, such as

  <John> in ex:Person .
  <John> foaf:name "John"^^xsd:string .
  <John> foaf:phone "+19085551212"^^xsd:string .
  <John> ex:child <Bill> .
  <John> ex:child <William> .
  <John> in <=1 foaf:name .
  <John> in <=1 foaf:phone .
  <John> in all child {<Bill>, <William>, <Susan>} .
  <Bill> /= <William> .
  <Bill> /= <Susan> .
  <Susan> /= <John> .

determining whether <John> belongs to ex:Person is just constraint validation.
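
The same idea can be sketched in Python for the validation case: once the graph is treated as the model, an axiom of the form C <= description is checked by testing every asserted member of C against the description and reporting what fails. This is only a sketch under stated assumptions. The names Literal, Restriction, AXIOMS and violations are invented for the example, "<John> in ex:Person" is encoded as an rdf:type triple, and the explicit closure statements from the example are left out because the closed-world reading of the graph already supplies them.

```python
from typing import NamedTuple, Optional


class Literal(NamedTuple):
    """A typed literal; hypothetical helper for this sketch."""
    value: str
    datatype: str


class Restriction(NamedTuple):
    """'=n prop' plus an optional 'all prop datatype' condition."""
    prop: str
    exactly: int
    datatype: Optional[str] = None


# Concept axiom from the example: ex:Person <= (description).
AXIOMS = {
    "ex:Person": [
        Restriction("foaf:name", 1, "xsd:string"),
        Restriction("foaf:phone", 1, "xsd:string"),
        Restriction("ex:child", 2),
    ],
}

# The data from the example; "<John> in ex:Person" becomes an rdf:type triple.
GRAPH = {
    ("John", "rdf:type", "ex:Person"),
    ("John", "foaf:name", Literal("John", "xsd:string")),
    ("John", "foaf:phone", Literal("+19085551212", "xsd:string")),
    ("John", "ex:child", "Bill"),
    ("John", "ex:child", "William"),
}


def violations(graph, axioms):
    """Check each asserted class member against its axiom, using the graph
    itself as the (complete) model. Returns (node, class, reason) tuples."""
    found = []
    for cls, description in axioms.items():
        members = sorted(
            s for (s, p, o) in graph if p == "rdf:type" and o == cls
        )
        for node in members:
            for r in description:
                vals = {o for (s, p, o) in graph if s == node and p == r.prop}
                if len(vals) != r.exactly:
                    found.append((
                        node, cls,
                        f"{r.prop}: expected {r.exactly} value(s), "
                        f"found {len(vals)}",
                    ))
                elif r.datatype is not None and any(
                    not isinstance(v, Literal) or v.datatype != r.datatype
                    for v in vals
                ):
                    found.append(
                        (node, cls, f"{r.prop}: value not of type {r.datatype}")
                    )
    return found


print(violations(GRAPH, AXIOMS))  # []
```

Removing John's foaf:phone triple from GRAPH makes the function report one violation for John, which is the closed-world counterpart of the contradiction that OWL inference would detect given the explicit closure statements.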
Setting up complete information is just what was done above. In a model there is complete information, so considering an RDF graph (plus consequences) as a model turns OWL inference into constraint validation.

Of course, this doesn't mean that you have to implement OWL inference in a model the same way that you need to with incomplete information. In fact, as above, constraint validation can be implemented as SPARQL queries.

Conclusion

In fact the main difference between recognition and constraint checking is that the former either has no axioms or only uses axioms defining names that do not occur in the RDF graph whereas constraint checking uses axioms that relate concepts appearing in the RDF graph to descriptions.

So OWL can indeed be used for both the syntax and semantics of constraint checking and closed-world recognition in RDF, and most or all of it can be implemented using a translation to SPARQL queries.
Received on Tuesday, 22 July 2014 10:12:40 UTC