what is inferencing? from Peter F. Patel-Schneider on 2016-05-10 (public-data-shapes-wg@w3.org from May 2016)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Mon, 9 May 2016 18:24:57 -0700
To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <d6be9ff8-54d9-4685-c177-1ba7295fb620@gmail.com>

There has been quite a bit of discussion of inferencing and SHACL recently.
However, there has not been much discussion on the actual relationship between
SHACL and inferencing.

It turns out that SHACL depends entirely on inferencing.  The definition of
sh:nodeKind, probably the simplest construct in SHACL, says "A validation
result must be produced for each value node that does not match the given node
kind."  But this is inferencing!  True, a very degenerate form of inferencing
but inferencing nonetheless.
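
To make the point concrete, here is a minimal sketch of what the sh:nodeKind
check amounts to.  The term encoding (a pair of kind tag and lexical form) and
the function name are illustrative assumptions, not anything taken from the
SHACL draft.

```python
# Hypothetical encoding of RDF terms as (kind, lexical_form) pairs,
# where kind is one of "IRI", "BlankNode", "Literal".
def node_kind_violations(value_nodes, node_kind):
    """Return the value nodes whose form does not match node_kind."""
    return [term for term in value_nodes if term[0] != node_kind]

values = [
    ("IRI", "http://example.org/alice"),
    ("Literal", "Alice"),
    ("BlankNode", "b0"),
]
# One validation result per value node that is not an IRI.
violations = node_kind_violations(values, "IRI")
```

Nothing beyond the form of each term is inspected here, which is why this
counts as the degenerate case.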

OK, let's rule out degenerate forms of inferencing, i.e., where all that is
required is looking at the form of an RDF term.

But then there is sh:hasValue.  Its definition says "A validation result must
be produced if the sh:hasValue is not among the value nodes."  This is again
inferencing and not so degenerate as the inferencing required for sh:nodeKind.
This construct succeeds or fails depending on whether a specific triple is
present in an RDF graph.  This is inferencing.  True, trivial inferencing but
inferencing nonetheless.
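
As a rough illustration, the sh:hasValue check reduces to a single membership
test.  The sketch below assumes a graph modelled as a Python set of
(subject, predicate, object) tuples; the ex: names and the function name are
made up for the example.

```python
# Hypothetical graph: a set of (subject, predicate, object) tuples.
graph = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:knows", "ex:carol"),
}

def has_value_violation(graph, focus, predicate, required_value):
    """True if required_value is not among the value nodes of focus."""
    return (focus, predicate, required_value) not in graph

ok = has_value_violation(graph, "ex:alice", "ex:knows", "ex:bob")
missing = has_value_violation(graph, "ex:alice", "ex:knows", "ex:dave")
```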

OK, let's rule out trivial forms of inferencing, i.e., where all that is
required is determining the presence of a single triple in an RDF graph.

But then there is sh:minCount.  Its definition says "A validation result must
be produced if the number of value nodes is less than the value of
sh:minCount."  This is again inferencing and not inferencing that can be done by
only looking for a single RDF triple.  Here the number of triples that satisfy
a particular criterion has to be determined.  This is actually quite
sophisticated inferencing indeed.
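
A minimal sketch of the counting involved, under the same assumption that a
graph is a set of (subject, predicate, object) tuples.  The names are
hypothetical and the code is only meant to show that every matching triple
has to be found and counted.

```python
# Hypothetical graph: a set of (subject, predicate, object) tuples.
graph = {
    ("ex:alice", "ex:email", "alice@example.org"),
    ("ex:alice", "ex:email", "a.smith@example.org"),
    ("ex:bob", "ex:email", "bob@example.org"),
}

def min_count_violation(graph, focus, predicate, min_count):
    """True if focus has fewer than min_count values for predicate."""
    # Collect the distinct value nodes, then compare their number.
    value_nodes = {o for (s, p, o) in graph if s == focus and p == predicate}
    return len(value_nodes) < min_count

alice_needs_two = min_count_violation(graph, "ex:alice", "ex:email", 2)
bob_needs_two = min_count_violation(graph, "ex:bob", "ex:email", 2)
```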

It thus seems that SHACL needs sophisticated inferencing.  Nonetheless let's
press on and rule out even the sort of inferencing where triples have to be
retrieved from an RDF graph until a set number of matching triples is obtained.

But then there is sh:class.  The definition of sh:class says "A validation
result must be produced for each value node that is either a literal or a
non-literal without a matching rdf:type. A non-literal matches a type if it
has an rdf:type value that is the given type or one of its (transitive)
subclasses, via rdfs:subClassOf."   So sh:class depends on the determination
of transitive closure, again a quite sophisticated kind of inference.
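
The transitive closure can be sketched as follows.  This is only an
illustration under stated assumptions: the graph is a set of
(subject, predicate, object) tuples, prefixed names stand in for IRIs, and
the helper names are invented for the example.

```python
RDF_TYPE = "rdf:type"
RDFS_SUBCLASS_OF = "rdfs:subClassOf"

# Hypothetical graph: a set of (subject, predicate, object) tuples.
graph = {
    ("ex:Student", RDFS_SUBCLASS_OF, "ex:Person"),
    ("ex:Person", RDFS_SUBCLASS_OF, "ex:Agent"),
    ("ex:alice", RDF_TYPE, "ex:Student"),
}

def superclasses(graph, cls):
    """cls plus every class reachable from it via rdfs:subClassOf."""
    seen = {cls}
    stack = [cls]
    while stack:
        current = stack.pop()
        for (s, p, o) in graph:
            if s == current and p == RDFS_SUBCLASS_OF and o not in seen:
                seen.add(o)  # the seen set also guards against cycles
                stack.append(o)
    return seen

def class_violation(graph, node, required_class, is_literal=False):
    """True if node is a literal or has no matching rdf:type."""
    if is_literal:
        return True
    types = {o for (s, p, o) in graph if s == node and p == RDF_TYPE}
    return not any(required_class in superclasses(graph, t) for t in types)

alice_is_agent = not class_violation(graph, "ex:alice", "ex:Agent")
alice_is_org = not class_violation(graph, "ex:alice", "ex:Organization")
```

Note that ex:alice is accepted as an ex:Agent only because two
rdfs:subClassOf triples are chained together, a conclusion that is not
stated by any single triple in the graph.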

So what does it mean that SHACL doesn't use inferencing?  As far as I can see
the only true statement about SHACL and (non-)inferencing is that everything
in SHACL can be implemented by a simple translation from a SHACL shape to a
SPARQL query.  (This dividing line isn't even supported by the SHACL document,
as the translation there is not to SPARQL but to a very significantly extended
SPARQL.)   And even this isn't true if recursion is added to SHACL.

It certainly isn't the case that the dividing line is the need to reason by
cases (or, equivalently, to entertain multiple models).   RDFS inferencing
does not need reasoning by cases and SHACL does not need all of RDFS inferencing.

If there is a dividing line, it may be that SHACL inferencing is not
inherently serial.  However, I'm not at all sure that SHACL is indeed not
inherently serial, even with numbers written in unary and without recursion.


The net result is that SHACL depends on inferencing, and even quite
sophisticated inferencing at that.

peter

Received on Tuesday, 10 May 2016 01:25:36 UTC