paper on recursive constraints

There is an interesting paper on recursive constraints to be presented at
ISWC this year.  The paper is "Semantics and Validation of Recursive SHACL"
by Julien Corman, Juan L. Reutter, and Ognjen Savković.  There is an
extended version of the paper available at
https://www.inf.unibz.it/krdb/KRDB%20files/tech-reports/KRDB18-01.pdf

The conclusion to take away from this paper is that there are multiple ways
of defining the meaning of recursive constraints with very different
behaviours.  If a standard has recursive constraints it needs to choose
which meaning it is going to give to them.


The paper presents a semantics for SHACL that is based on three values:
true, false, and a third value whose intuitive reading should probably be
either unknown or neither.  With this third value means the semantics does
not collapse on any set of shapes, even when there is negative recursion
over loops of odd length.

The semantics does not push looping shapes to their maximal extension nor
does it push looping shapes to their minimal extension, instead letting
their extension be freely chosen among those models that satisfy the
constraints.  In this way it is quite similar to the semantics of
description logics, and quite different from both the semantics of ShEx and
the behaviour of SHACL implementations that assume success when they
encounter a shape loop.

In this semantics a set of shapes is valid if there is some assignment of
nodes to sets of shapes such that the constraints on shapes are not
violated.  The semantics does not need stratification of negation.

So in this semantics the following graph
  ex:start ex:p ex:v .
is valid for the following pair of shapes
  ex:start  :  =1. ex:p ex:loop
  ex:loop   =  ex:loop
but also for this pair of shapes
  ex:start  :  =0. ex:p ex:loop
  ex:loop   =  ex:loop
The first set of shapes is validated by an assignment that has ex:v in
ex:loop.  The second set of shapes is validated by an assignment that has
ex:v not in ex:loop.  In both these assignments no shape constraint is
violated.

However, the graph above is not valid for
  ex:start  :  =1. ex:p ex:loop
  ex:start  :  =0. ex:p ex:loop
  ex:loop   =  ex:loop
because there is no assignment that violates none of the constraints.


Determining validity in this semantics is co-NP complete in the size of the
graph even if there is no negation in loops.  This can be easily seen by
creating shapes that encode the evaluation rules for propositional logic and
having the data encode propositional formulae.  The trick is to have
variables be loops as above so that the shapes are valid if there is an
assignment to the variables that makes the formula evaluate to true.

The following shapes that encode the evaluation rules for propositional logic:

  b:vtrue  =  =1 . b:var b:vtrue
  b:ntrue  =  =1 . b:negated (~ b.true)
  b:dtrue  =  >=1 . b:disjunct b.true
  b:ctrue  =  >=1 . b.conjunct top  &  =0 . b.conjunct (~ b.true)
  b:true   =  b:ctrue | b:dtrue | b:ntrue | b:vtrue
  b:start  :  b:true

To encode a formula just use nodes with no outgoing edges for the variables,
nodes for the sub-formulae with outgoing edges with the appropriate labels
to their sub-formulae using b:start for the main formula.

So  ( v & ~x ) | ( w & ~ ( y | z ) )
becomes
  b:start b.disjunct _:d1 .
  _;d1 b.conjunct :v .
  _:d1 b.conjunct _:c12 .
  _:c12 b.negated :x .
  b:start b.disjunct _:d2 .
  _:d2 b.conjunct w .
  _:d2 b.conjunct _:c22 .
  _:c22 b.negated _:n221 .
  _:n221 b.disjunction :y .
  _:n221 b.disjunction :z .
  :v b:var :v .
  :w b:var :w .
  :x b:var :x .
  :y b:var :y .
  :z b:var :z .


In the above, the third truth value does not play a significant part.  The
third truth value is only really so that negated odd-length loops don't
cause the  semantics to collapse.   This allows the shapes
  l:start  :  >=1 . l:p  l:nloop .
  l:nloop  =  =1 l:p ( ~ l:nloop ) .
to validate against
  l:start l:p _:loops , _:loopd .
  _:loops l:p _:loops .
  _:loopd l:p _:loopd2 .
  _:loopd2 l:p _:loopd .
because _:loops can be given the third truth value for l:nloop.



Because the third truth value is less informative than the other two truch
values it is possible to come up with an approximate validation process that
is sound but not complete and runs in polynomial time.   The process starts
out by computing an approximation that assigns the third truth value where
it would normally have to make a guess, i.e., for assignments that depend on
some loop.  It then makes guesses, but only up to a pre-assigned depths.
The process runs time polynomial, but the polynomial is in the size of the
graph to the pre-assigned depth, so useful values for the pre-assigned depth
are 0 and 1.



peter

Received on Monday, 18 June 2018 15:45:08 UTC