- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Mon, 18 Jun 2018 08:44:40 -0700
- To: public-shacl@w3.org
There is an interesting paper on recursive constraints to be presented at ISWC this year. The paper is "Semantics and Validation of Recursive SHACL" by Julien Corman, Juan L. Reutter, and Ognjen Savković. There is an extended version of the paper available at https://www.inf.unibz.it/krdb/KRDB%20files/tech-reports/KRDB18-01.pdf The conclusion to take away from this paper is that there are multiple ways of defining the meaning of recursive constraints with very different behaviours. If a standard has recursive constraints it needs to choose which meaning it is going to give to them. The paper presents a semantics for SHACL that is based on three values: true, false, and a third value whose intuitive reading should probably be either unknown or neither. With this third value means the semantics does not collapse on any set of shapes, even when there is negative recursion over loops of odd length. The semantics does not push looping shapes to their maximal extension nor does it push looping shapes to their minimal extension, instead letting their extension be freely chosen among those models that satisfy the constraints. In this way it is quite similar to the semantics of description logics, and quite different from both the semantics of ShEx and the behaviour of SHACL implementations that assume success when they encounter a shape loop. In this semantics a set of shapes is valid if there is some assignment of nodes to sets of shapes such that the constraints on shapes are not violated. The semantics does not need stratification of negation. So in this semantics the following graph ex:start ex:p ex:v . is valid for the following pair of shapes ex:start : =1. ex:p ex:loop ex:loop = ex:loop but also for this pair of shapes ex:start : =0. ex:p ex:loop ex:loop = ex:loop The first set of shapes is validated by an assignment that has ex:v in ex:loop. The second set of shapes is validated by an assignment that has ex:v not in ex:loop. In both these assignments no shape constraint is violated. However, the graph above is not valid for ex:start : =1. ex:p ex:loop ex:start : =0. ex:p ex:loop ex:loop = ex:loop because there is no assignment that violates none of the constraints. Determining validity in this semantics is co-NP complete in the size of the graph even if there is no negation in loops. This can be easily seen by creating shapes that encode the evaluation rules for propositional logic and having the data encode propositional formulae. The trick is to have variables be loops as above so that the shapes are valid if there is an assignment to the variables that makes the formula evaluate to true. The following shapes that encode the evaluation rules for propositional logic: b:vtrue = =1 . b:var b:vtrue b:ntrue = =1 . b:negated (~ b.true) b:dtrue = >=1 . b:disjunct b.true b:ctrue = >=1 . b.conjunct top & =0 . b.conjunct (~ b.true) b:true = b:ctrue | b:dtrue | b:ntrue | b:vtrue b:start : b:true To encode a formula just use nodes with no outgoing edges for the variables, nodes for the sub-formulae with outgoing edges with the appropriate labels to their sub-formulae using b:start for the main formula. So ( v & ~x ) | ( w & ~ ( y | z ) ) becomes b:start b.disjunct _:d1 . _;d1 b.conjunct :v . _:d1 b.conjunct _:c12 . _:c12 b.negated :x . b:start b.disjunct _:d2 . _:d2 b.conjunct w . _:d2 b.conjunct _:c22 . _:c22 b.negated _:n221 . _:n221 b.disjunction :y . _:n221 b.disjunction :z . :v b:var :v . :w b:var :w . :x b:var :x . :y b:var :y . :z b:var :z . In the above, the third truth value does not play a significant part. The third truth value is only really so that negated odd-length loops don't cause the semantics to collapse. This allows the shapes l:start : >=1 . l:p l:nloop . l:nloop = =1 l:p ( ~ l:nloop ) . to validate against l:start l:p _:loops , _:loopd . _:loops l:p _:loops . _:loopd l:p _:loopd2 . _:loopd2 l:p _:loopd . because _:loops can be given the third truth value for l:nloop. Because the third truth value is less informative than the other two truch values it is possible to come up with an approximate validation process that is sound but not complete and runs in polynomial time. The process starts out by computing an approximation that assigns the third truth value where it would normally have to make a guess, i.e., for assignments that depend on some loop. It then makes guesses, but only up to a pre-assigned depths. The process runs time polynomial, but the polynomial is in the size of the graph to the pre-assigned depth, so useful values for the pre-assigned depth are 0 and 1. peter
Received on Monday, 18 June 2018 15:45:08 UTC