- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Fri, 10 Jun 2016 06:21:22 -0700
- To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Pre-binding and sh:hasShape form a large part of the meaning of SHACL. They are not just part of the extension mechanism in SHACL but are used in the definition of the core of SHACL. In Section 1.5 there is

  This specification uses parts of SPARQL 1.1 in the normative definition of the semantics of the SHACL Core constraints and scopes. SPARQL variables using $ marker represent external values that must be pre-bound in the SPARQL query before execution. Some SHACL constraints are defined with the use of the sh:hasShape function.

In Section 4 there is

  The SPARQL definitions in this section also assume the existence of a built-in SPARQL function sh:hasShape.

Then pre-binding shows up in the normative definition of every core constraint component and sh:hasShape shows up in the normative definitions of sh:not, sh:and, sh:or, sh:shape, and sh:qualifiedValueShape.

It is possible to implement the core of SHACL without using sh:hasShape and pre-binding, but such an implementation will be implementing something that is defined in large part by sh:hasShape and pre-binding.

In the extension part of SHACL, sh:hasShape and pre-binding are used directly when writing the SPARQL code that implements templates. Problems with sh:hasShape and pre-binding thus are not just problems with an underlying definition of SHACL but also directly affect the meaning of constructs that are employed by users of SHACL.

It is possible to have a SPARQL-based extension mechanism for SHACL that does not use sh:hasShape and does not use pre-binding. Thus neither sh:hasShape nor pre-binding is needed for SHACL.

sh:hasShape is currently defined in Appendix A of the SHACL specification, http://w3c.github.io/data-shapes/shacl/#hasShape. sh:hasShape currently produces three results: undefined if recursion is encountered, true if no violation validation result is produced, and false if some violation result is produced. This description of sh:hasShape has several problems.
First, it is unclear which validation results count in the description. Is it only results from the direct validation of the focus node, or do results from embedded shapes count? Second, the three possibilities are not disjoint. Third, recursion is not possible in SHACL so the undefined result can never occur.

However, the biggest problem with sh:hasShape is that it depends on pre-binding. sh:hasShape has to evaluate SPARQL queries in a context where several query variables are limited to certain values. This is an innate peculiarity of using a SPARQL function that in turn initiates further SPARQL query processing, so problems in pre-binding are problems for sh:hasShape.

Pre-binding of variables in SHACL is currently defined in Appendix B of the SHACL specification, http://w3c.github.io/data-shapes/shacl/#pre-binding. Pre-binding is defined, in full, as

  Pre-binding a variable with a value means that the SPARQL processor needs to evaluate all occurrences of variables with that same name (including occurrences in inner scopes and nested SELECT queries) so that they have the provided value. In other words, whenever a SPARQL processor evaluates a pre-bound variable, it must use the given value.

This definition does not align with the definition of SPARQL at all. SPARQL is a query language and often does not evaluate query variables. In particular, SPARQL does not evaluate query variables in basic graph patterns. The definition of basic graph pattern matching in SPARQL, from https://www.w3.org/TR/sparql11-query/#BasicGraphPattern, is

  Let BGP be a basic graph pattern and let G be an RDF graph. μ is a solution for BGP from G when there is a pattern instance mapping P such that P(BGP) is a subgraph of G and μ is the restriction of P to the query variables in BGP.

Note that there is no notion of evaluation here at all. Using evaluation as the basis of the definition of pre-binding is thus disconnected from a large part of the behaviour of SPARQL.
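
To make the point about basic graph pattern matching concrete, here is a minimal Python sketch of that definition. It is only an illustration under simplifying assumptions: triples are plain string tuples, blank nodes are ignored, and the names match_triple, match_bgp, GRAPH and PREBOUND are hypothetical, not taken from any specification or library.

```python
def is_var(term):
    # Both the ? and the $ marker introduce a SPARQL query variable.
    return term[0] in "?$"

def match_triple(pattern, triple, mu):
    """Return mu extended so that pattern maps onto triple, or None."""
    mu = dict(mu)
    for p, t in zip(pattern, triple):
        if is_var(p):
            name = p[1:]  # ?x and $x name the same variable
            if mu.setdefault(name, t) != t:  # already mapped to another term
                return None
        elif p != t:  # a constant term has to match exactly
            return None
    return mu

def match_bgp(bgp, graph):
    """All mappings mu such that mu applied to bgp is a subgraph of graph."""
    solutions = [{}]
    for pattern in bgp:
        extended = []
        for mu in solutions:
            for triple in graph:
                mu2 = match_triple(pattern, triple, mu)
                if mu2 is not None:
                    extended.append(mu2)
        solutions = extended
    return solutions

# Hypothetical data graph and pre-bound values.
GRAPH = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:bob", "rdf:type", "ex:Person"),
]
PREBOUND = {"this": "ex:alice", "predicate": "ex:knows"}  # never consulted

solutions = match_bgp([("$this", "$predicate", "?value")], GRAPH)
print(len(solutions))  # 2: one solution per triple in the graph
```

Matching only ever builds mappings for variables; at no step is a variable evaluated, so a rule phrased in terms of evaluating pre-bound variables has nothing to attach to. PREBOUND is defined above purely to show that the matcher has no place to use it.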
This disconnect shows up in even the simplest of SPARQL queries that implement constraint components. Consider the normative SPARQL definition of sh:class in property constraints

  SELECT $this ($this AS ?subject) $predicate (?value AS ?object)
  WHERE {
    $this $predicate ?value .
    FILTER NOT EXISTS { ?value rdf:type/rdfs:subClassOf* $class } .
  }

The pre-binding of $this and $predicate does not affect the meaning of the basic graph pattern

  $this $predicate ?value .

so that, according to the definitions of SHACL and SPARQL, the solution sequence generated from matching this basic graph pattern will have a solution for each triple in the data graph.

This is already a total failure, but what happens next? Well, the filter is used to remove some of the solutions, using the SPARQL semantic Filter function. Each solution is checked to see whether the filter evaluates to true for that solution. Because the filter expression is an EXISTS expression it uses the SPARQL substitute function, which for each query variable in

  ?value rdf:type/rdfs:subClassOf* $class

replaces it by its mapping in the solution, if any. Because there is a solution for each triple in the data graph, this will result in that many substitutions. Next, each of these substitutions is separately matched against the data graph. This matching will have a result for values of ?value that are the subject of an rdf:type triple, and then these solutions are filtered out. So the end result will have a solution for every triple in the data graph where the object of the triple is not the subject of an rdf:type triple.

Of course this is completely not what the result should be. However, it is what the current definition of SHACL says the result is.

Some SPARQL expert is going to have to take a close look at pre-binding to determine what its definition should be. However, before that there needs to be a closer look taken at how pre-binding should operate.
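
The walk-through of the sh:class query can be reproduced with a small Python model. This is a sketch under stated assumptions, not an implementation of SPARQL or SHACL: triples are string tuples, the data graph is invented, and the function names literal_reading, superclasses and substituted_reading are hypothetical. The second reading, in which the pre-bound values replace $this, $predicate and $class before anything is matched, is only an assumption about what the constraint is meant to compute.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"

# Hypothetical data graph, as (subject, predicate, object) triples.
GRAPH = [
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:knows", "ex:rex"),
    ("ex:bob", RDF_TYPE, "ex:Student"),
    ("ex:rex", RDF_TYPE, "ex:Dog"),
    ("ex:Student", SUBCLASS, "ex:Person"),
]

def literal_reading(graph):
    """sh:class query with pre-binding having no effect on pattern matching."""
    # `$this $predicate ?value` is three variables, so every triple matches.
    solutions = [{"this": s, "predicate": p, "value": o} for s, p, o in graph]
    # FILTER NOT EXISTS { ?value rdf:type/rdfs:subClassOf* $class }:
    # substitute() fills in ?value; $class has no mapping in the solution and
    # stays a variable, and rdfs:subClassOf* also matches a zero-length path,
    # so the pattern matches whenever the value has any rdf:type triple.
    typed = {s for s, p, _ in graph if p == RDF_TYPE}
    return [mu for mu in solutions if mu["value"] not in typed]

def superclasses(graph, cls):
    """cls plus everything reachable from it via rdfs:subClassOf."""
    seen, todo = {cls}, [cls]
    while todo:
        current = todo.pop()
        for s, p, o in graph:
            if s == current and p == SUBCLASS and o not in seen:
                seen.add(o)
                todo.append(o)
    return seen

def substituted_reading(graph, this, predicate, cls):
    """Assumed intent: replace $this, $predicate and $class by values first."""
    violations = []
    for s, p, o in graph:
        if s == this and p == predicate:
            types = [t for s2, p2, t in graph if s2 == o and p2 == RDF_TYPE]
            if not any(cls in superclasses(graph, t) for t in types):
                violations.append(o)
    return violations

print([mu["value"] for mu in literal_reading(GRAPH)])
print(substituted_reading(GRAPH, "ex:alice", "ex:knows", "ex:Person"))
```

On this graph the literal reading keeps three solutions, whose values are ex:Student, ex:Dog and ex:Person, and none of them concerns the values of ex:alice at all. The substituted reading reports ex:rex as the single violation. How far such a substitution should reach into a query is one of the open questions.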
For example, should pre-binding affect variables throughout the query or only variables that would be affected by a BIND construct at the beginning of the query? There should be some examples generated to show how pre-binding works under these two options so that the working group can make an informed decision.

Peter F. Patel-Schneider
Nuance Communications
Received on Friday, 10 June 2016 13:21:51 UTC