- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Mon, 9 May 2016 18:24:57 -0700
- To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
There has been quite a bit of discussion of inferencing and SHACL recently. However, there has not been much discussion of the actual relationship between SHACL and inferencing. It turns out that SHACL depends entirely on inferencing.

The definition of sh:nodeKind, probably the simplest construct in SHACL, says "A validation result must be produced for each value node that does not match the given node kind." But this is inferencing! True, a very degenerate form of inferencing, but inferencing nonetheless.

OK, let's rule out degenerate forms of inferencing, i.e., where all that is required is looking at the form of an RDF term. But then there is sh:hasValue. Its definition says "A validation result must be produced if the sh:hasValue is not among the value nodes." This is again inferencing, and not so degenerate as the inferencing required for sh:nodeKind. This construct succeeds or fails depending on whether a specific triple is present in an RDF graph. This is inferencing. True, trivial inferencing, but inferencing nonetheless.

OK, let's rule out trivial forms of inferencing, i.e., where all that is required is determining the presence of a single triple in an RDF graph. But then there is sh:minCount. Its definition says "A validation result must be produced if the number of value nodes is less than the value of sh:minCount." This is again inferencing, and not inferencing that can be done by looking for only a single RDF triple. Here the number of triples that satisfy a particular criterion has to be determined. This is actually quite sophisticated inferencing. It thus seems that SHACL needs sophisticated inferencing.

Nonetheless, let's press on and rule out even the sort of inferencing where triples have to be retrieved from an RDF graph until a set number of matching triples is obtained. But then there is sh:class.
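To make the three checks discussed so far concrete, here is a minimal sketch in Python. It assumes the graph is a plain set of (subject, predicate, object) tuples and that the path is a single predicate; the IRI and Literal classes and all function names are hypothetical, chosen for illustration, and are not part of SHACL or of any RDF library.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IRI:
    """Hypothetical stand-in for an IRI term."""
    value: str


@dataclass(frozen=True)
class Literal:
    """Hypothetical stand-in for a literal term."""
    value: str


def value_nodes(graph, focus, path):
    """All objects o such that (focus, path, o) is a triple in the graph."""
    return {o for (s, p, o) in graph if s == focus and p == path}


def node_kind_iri_violations(graph, focus, path):
    # Like sh:nodeKind sh:IRI: only the form of each value node is inspected.
    return {v for v in value_nodes(graph, focus, path) if not isinstance(v, IRI)}


def has_value_violated(graph, focus, path, required):
    # Like sh:hasValue: depends on the presence of one specific triple.
    return (focus, path, required) not in graph


def min_count_violated(graph, focus, path, min_count):
    # Like sh:minCount: the matching triples have to be counted.
    return len(value_nodes(graph, focus, path)) < min_count


# Illustrative data only.
alice, knows = IRI("ex:alice"), IRI("ex:knows")
graph = {
    (alice, knows, IRI("ex:bob")),
    (alice, knows, Literal("carol")),
}
```

Each function answers a question about the graph that is not stated in any single triple as such, and each step (term form, triple presence, triple count) needs a little more work than the one before it.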
The definition of sh:class says "A validation result must be produced for each value node that is either a literal or a non-literal without a matching rdf:type. A non-literal matches a type if it has an rdf:type value that is the given type or one of its (transitive) subclasses, via rdfs:subClassOf." So sh:class depends on the determination of transitive closure, again a quite sophisticated kind of inference.

So what does it mean that SHACL doesn't use inferencing? As far as I can see, the only true statement about SHACL and (non-)inferencing is that everything in SHACL can be implemented by a simple translation from a SHACL shape to a SPARQL query. (This dividing line isn't even supported by the SHACL document, as the translation there is not to SPARQL but to a very significantly extended SPARQL.) And even this isn't true if recursion is added to SHACL.

It certainly isn't the case that the dividing line is the need to reason by cases (or, equivalently, to entertain multiple models). RDFS inferencing does not need reasoning by cases, and SHACL does not need all of RDFS inferencing.

If there is a dividing line, it may be that SHACL inferencing is not inherently serial. However, I'm not at all sure that SHACL is indeed not inherently serial, even with numbers written in unary and without recursion.

The net result is that SHACL depends on inferencing, and quite sophisticated inferencing at that.

peter
Received on Tuesday, 10 May 2016 01:25:36 UTC