- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Thu, 22 Sep 2016 18:36:42 -0700
- To: Holger Knublauch <holger@topquadrant.com>, public-rdf-shapes@w3.org
Responses in line.

peter

> pre-binding
>
> SPARQL does not evaluate variables that occur in basic graph patterns. This means that the definition of pre-binding has unusual behaviour. For example, the normative SPARQL definition of sh:class will return validation results for every pair of nodes in the graph such that there is an rdf:type/rdfs:subClassOf* path from the first to the second.
>
> This problem affects many parts of the definition of SHACL. It means that the normative definition of many SHACL constructs is counter to intuitions. This problem is not ameliorated by the caution box in Appendix B.
>
> Comment (HK): WG is waiting for input from the SPARQL EXISTS CG on this topic.

The current definition of pre-binding in the 22 September 2016 Editors' Draft is broken. Following the description of pre-binding in http://w3c.github.io/data-shapes/shacl/#pre-binding results in something that does not serve any useful purpose. The working group needs to address this problem and cannot count on the SPARQL Maintenance (EXISTS) Community Group producing anything relevant as pre-binding is not part of their charter.

> syntax of SPARQL variables
>
> SPARQL treats $ and ? as equivalent so $PATH and ?PATH both refer to the PATH variable. SHACL uses $ as a special marker and includes $ and ? as part of the variable.
>
> Would ?PATH be substituted as $PATH is? If a SPARQL query for a SHACL constraint only used ?this would the variable this be pre-bound?
>
> Comment (HK): I have tried to address this here (https://github.com/w3c/data-shapes/commit/4871ced946aa03cd2bd91d808d8e4a1b33e64ef6) so that the text no longer refers to things like $PATH as a variable, but instead to PATH.

It looks as if the commit has largely addressed this editorial issue. I have not checked that all vestiges of the problem have been eliminated.

> pre-binding optional?
>
> "SPARQL variables using the $ marker represent external values that must be pre-bound or substituted in the SPARQL query before execution." "When SPARQL constraints are executed, the validation engine should pre-bind values for these variables." Are some $-marked variables not necessarily pre-bound, counter to the earlier requirement?
>
> Comment (HK): The "should" was indeed a mistake, it's not optional. Removed: https://github.com/w3c/data-shapes/commit/ecdad602d5d4bfeb3a2a876298349fe69d0c4e60

The commit has addressed this editorial issue.

> $PATH vs other $-prefixed variables
>
> The variable PATH is treated specially in SHACL. However, the general description of $ does not specially call out PATH: "SPARQL variables using the $ marker represent external values that must be pre-bound or substituted in the SPARQL query before execution."
>
> Comment (HK): Addressed here, pointing out the special treatment of PATH: https://github.com/w3c/data-shapes/commit/a5db1204433b19a0da099a8a89af76186d865f6c

The commit introduces the distinction. The current version of Appendix C describes the special treatment of PATH.

> $value
>
> $value is used in many ASK queries. However the definition of ASK validators does not appear to pre-bind value.
>
> Comment (HK): 4.1 states "These queries are interpreted against each value node, bound to the variable value." A similar statement exists in section 6.4.2. So I am not sure what is missing here.

Nothing. My mistake.

> aggregation
>
> The prohibition "Furthermore, any query that uses the variable $this in an aggregation is invalid." is vague. It appears to disallow the use of $this in any part of the SPARQL 1.1 aggregation machinery, as the pointer in the sentence is to Section 11 of the SPARQL specification. This would rule out all of the examples of aggregation in the SHACL document.
>
> Comment (HK): I have tried to clarify that this is only about the use of ?this in expressions. This is allowing its use in GROUP BY, in case you were referring to this. Apart from that I don't see uses of ?this in aggregations in the SHACL document. https://github.com/w3c/data-shapes/commit/0c6939ba95ffd6c7fee2285a3638c144a97f8528

GROUP BY is part of aggregation. There were four examples of GROUP BY ?this which the mentioned wording appears to prohibit.

The current wording is no better. "[T]he expression used in an aggregation" is an incorrect description as there can be multiple expressions in the aggregation portion of a query. The argument to GROUP BY itself is just as much an expression as the argument to HAVING.

This situation is indicative of the sloppiness of the current specification. SPARQL has a complicated grammar. The argument to GROUP BY is a sequence of GroupCondition; the argument to HAVING is a sequence of HavingCondition. Using general words, like "expression", to describe bits of the SPARQL grammar is generally incorrect. Where specific bits of SPARQL are used in the SHACL definition they need to be described as they are in the SPARQL definition.

> ASK validators syntax
>
> The syntax for ASK queries in SPARQL 1.1 is
>
> "ASK" DatasetClause* WhereClause SolutionModifier
>
> The syntax for WhereClause is
>
> 'WHERE'? GroupGraphPattern
>
> The syntax for EXISTS constructs SPARQL 1.1 is
>
> 'EXISTS' GroupGraphPattern
>
> Stripping the ASK from the beginning of an ASK query does not generally end up with a GroupGraphPattern that can be used as the argument for EXISTS.
>
> Comment (HK): Thanks for pointing out this detail. I have tried to address this with: https://github.com/w3c/data-shapes/commit/d820e0bac287944fb13edc86040995927f02e20d
> It appears that the values of sh:ask are never used as ASK queries by SHACL processors. Why then are these of the form of ASK queries?
>
> Comment (HK): While in theory we could have stated GroupGraphPattern, I think ASK is more intuitive to explain and allows stand-alone execution with copy and paste. Furthermore they align with the use of functions, which can also have ASK queries as their bodies.

The syntax of ASK queries doesn't match the syntax required for EXISTS and is thus unsuitable here. The conniptions required to get them to sort of match show just how unsuitable this is.

> different levels of SHACL implementation
>
> There are several different kinds of SHACL implementations that are hinted at in the document.
>
> "SHACL implementations may, but are not required to, support entailment regimes."
> "Access to the shapes graph is not a requirement for supporting the SHACL Core language."
> "This sections [sic] defines the built-in SHACL constraint components that MUST be supported by all SHACL validation engines."
> "Not all SHACL validation engines need to support this variable."
> "The same support policies as for $shapesGraph apply for this variable."
> "SPARQL engines with full SHACL support can install a new SPARQL function based on the SPARQL 1.1 Extensible Value Testing mechanism."
> "SHACL validation engines are not required to support any entailment regimes."
> "SHACL implementations with full support of the SHACL SPARQL extension mechanism must implement a function sh:hasShape, ...."
> "A SHACL validation engine MUST implement all constructs in the Core of SHACL (Sections 2, 3, 4). A SHACL engine MAY not implement the other parts of SHACL."
> "Implementations that cover only the the SHACL Core features are not required to implement these mechanisms or the sh:hasShape function."
> "SHACL validation engines MAY pre-bind the variable $shapesGraph to provide access to the shapes graph."
> "A SHACL validation engine MAY use such suggestions to determine which shapes graph to use for validating a data graph."
> "A SHACL validation engine MAY take this information into account to determine which shapes graph to use for validating a data graph that uses that ontology or vocabulary."
>
> There needs to be a section that explicitly defines the different levels of implementation.
>
> Comment (HK): Not sure what to do about this. There is an almost infinite amount of combinations of these above, so one could define many dialects. But only one of them is the full SHACL. I would prefer all SHACL engines to support all these features but there was too much resistance, e.g. from those favoring a single-query-code-generation approach or working against SPARQL end points. The resulting mess is reflecting the heterogeneous nature of the SPARQL universe, whether we want it or not.
> Comment (DK): What if we created a section at the end of part II called "Optional features of the SHACL SPARQL extension mechanism" (or something similar) where we list all optional features
> Comment (HK): Ok, I have added an appendix with the goal of enumerating all optional features. Could you double check this: https://github.com/w3c/data-shapes/commit/e198bc9689c95e89e8caeb8c3c787b9efa579856

This does not appear to address my concerns. How many different levels of SHACL implementation are there? For example, can a SHACL implementation implement SPARQL-based constraints but not access to the shapes graph, or some other random set of the optional parts of SHACL?

> order of processing for filters
>
> The discussion of how filters are processed appears to be contradictory. First there is: "SHACL validation engines MAY alter the order of the depicted steps as long as the returned validation results are correct." Later there is: "Filter shapes MUST be evaluated before validating the associated shapes or constraints."
>
> Comment (HK): Yes, the first sentence is IMHO incorrect and I have taken it out (https://github.com/w3c/data-shapes/commit/3777e8e80aec9f9c1ba1bbb0dfdfce2b2acb9a12). The problem is that if an engine does filtering after validation, it may run into a failure that is otherwise not reached. I don't remember why we added that statement in the first place, do you @Dimitris?
> Comment (DK): This was changed to address a comment from Peter on March 7th and resulted in this commit

This appears to be two different responses. What is the situation?

> $shapesGraph
>
> The status of $shapesGraph is unclear: "SPARQL variables using the $ marker represent external values that must be pre-bound or substituted in the SPARQL query before execution." "SHACL validation engines MAY pre-bind the variable $shapesGraph to provide access to the shapes graph."
>
> Comment (HK): The MAY is clarified in the following sentence (Access to the shapes graph is not a requirement etc). I believe it would be confusing to soften up the must in the first sentence because of this exception.

It remains that there are two controlling wordings for how to handle $shapesGraph, one with a must (which probably should be MUST) and one with MAY. These appear to be contradictory.

> circular filters
>
> What happens if a shape is one of its own filters?
>
> Comment (HK): The same as with other recursive scenarios - it's undefined.

OK.

> EXISTS and blank nodes
>
> The definition of ASK binds the value variable and then uses it inside an EXISTS. The definition of SPARQL provides a counter-intuitive result if this variable is bound to a blank node, resulting in, for example, a sh:class constraint with class ex:C returning no violation for _:d in any data graph containing the triple
>
> ex:c rdf:type ex:C .
>
> Comment (HK): We are awaiting input from the SPARQL Maintenance (EXISTS) community group.

The document needs to mention where the problems with EXISTS currently affect SHACL.

> union operations on data graphs and shapes graphs
>
> It is unclear just what the data graph and the shapes graph are. There is wording that both of these cannot be changed. However, there is also wording that various kinds of union operations are to be performed on shapes and data graphs.
>
> Comment (HK): The only place I could find "union" was about handling of owl:imports, which states that the result of this union is used as shapes graph. This looks OK to me. Could you clarify what you mean?
> Comment (DK): I tried to make the wording clearer here: https://github.com/w3c/data-shapes/commit/b6fd2db5719cc9c9bdec464acdd2aefc8d0b5b68

I don't find this much better. If the shapes graph and the data graph cannot be changed then there should not be wording about unioning, extending, or otherwise modifying the shapes graph or the data graph.

> $targetNode
>
> It is unclear what is meant by: "The variable $targetNode is assumed to be pre-bound to the given value of sh:targetNode." Is this something that SHACL implementations have to do? There are several occurrences of this kind of wording.
>
> Comment (HK): I don't see anything wrong here. "is assumed to" is IMHO OK because this section is merely describing the formal semantics without prescribing an implementation. Implementations will (almost certainly) not use a SPARQL query.

The use of words like "assumed", particularly with no modifiers, is generally problematic in specifications. It certainly is problematic here. Instead of assuming that something is a particular way, definitions should require it.

> MAY
>
> MAY is used in 1.5 but defined in 1.6
>
> Comment (HK): Ok, moved higher up https://github.com/w3c/data-shapes/commit/bda4e2c4781494ac0e26eb132c7e7dae15932739

OK

> MAY 2
>
> "A SHACL engine MAY not implement the other parts of SHACL." reads as if no SHACL engine is allowed to implement any non-core part of SHACL.
>
> Comment (HK): See https://github.com/w3c/data-shapes/commit/2ba049e6e39096bf47355b03d1de02c2e0e84f59

Better.

> Graphs SHOULD
>
> "The data graph SHOULD include all the ontology axioms related to the data and especially all the rdfs:subClassOf triples in order for SHACL to correctly identify class targets and validate Core SHACL constraints." Data graphs are just graphs. How thus can SHOULD be applied to them?
>
> Comment (HK): I have replaced the SHOULD with "is expected to": https://github.com/w3c/data-shapes/commit/fd3fbeac7826f9df87111af878e65e34a502331c

Better.

> Suggestions
>
> "A SHACL validation engine MAY use such suggestions to determine which shapes graph to use for validating a data graph." Can this be done even when an explicit shapes graph is provided to the engine?
>
> Comment (HK): Attempted to clarify at https://github.com/w3c/data-shapes/commit/601631a5f4b965fa79f7b44a5a348702326ef315

Better, but retains the issue of changing the unchangeable.

> Different shapes graph
>
> "The same mechanism applies for ontologies or vocabularies included in the shapes graph. The ontology or the vocabulary IRI can point to one or more shapes graphs with the predicate sh:shapesGraph. A SHACL validation engine MAY take this information into account to determine which shapes graph to use for validating a data graph that uses that ontology or vocabulary." If there already is a shapes graph in play, why is there any need for a different shapes graph to be used?
>
> Comment (HK): I have changed the prose to clarify that sh:shapesGraph only points at graphs, not shape graphs: https://github.com/w3c/data-shapes/commit/c88df2cf50cbc5f31feaabf610a0143d3ebcf0fb
> Comment (DK): I removed the "in the shapes graph" here. This was meant as a general property for ontology design not only when it is used in one of the shapes/data graph

But MAY SHACL implementations do this when they are explicitly given a shapes graph?

> Deep copy
>
> "a deep copy of sh:path as its sh:path" What is "deep copy" in this context?
>
> Comment (HK): I have attempted to clarify this here: https://github.com/w3c/data-shapes/commit/d3f8f858f95b010d1f2a0e4681da203bcbfbc217
> Comment (kc): Unless "deep copy" has some pre-defined meaning that I am unaware of, I would suggest dropping it and saying: The value of sh:path of each validation result must copy all triples that are required by the <a href="#path-syntax">SHACL well-formed path syntax rules</a> from the <a>shapes graph</a> into the graph containing the validation results.
> Comment (HK): The first google match of "deep copy" is pretty close to what I wanted to express, so I believe the term should be familiar to many people and may be helpful for implementers. Also I had surrounded the term with "...". Anyway, I have no strong opinion and let others decide.

The extra wording is helpful. However, "deep copy" in https://en.wikipedia.org/wiki/Object_copying#Deep_copy is different. Either drop "deep copy" or point to an appropriate definition.

> Filter role
>
> "A filter is a shape in a shapes graph that can be used to limit the nodes that are validated against a given constraint or shape." Are there some filters that cannot be used in this way? Which ones?
>
> Comment (HK): I don't understand this comment. The current statement does not exclude any filters from being used this way.
> Comment (DK): This commit should fix this issue.

Better.

> Incomplete table
>
> "The following table enumerates variables that have special meaning in SPARQL constraints. When SPARQL constraints are executed, the validation engine should pre-bind values for these variables." However, many other variables also need to be pre-bound, such as the variables corresponding to parameters.
>
> Comment (HK): First, the statement above does not exclude other variables from being pre-bound. It doesn't claim that the table contains "all" variables. Second, this is in a chapter about SPARQL Constraints, where parameters have no meaning. So I don't think anything is wrong here.
> Comment (DK): I think this commit helps more with this issue. I am not sure if we should move that table in the prebinding section since it affects prebinding as a whole, not only SPARQL constraints

Reading Section 5.3 still gives me the feeling that there is an implicit completeness consideration here. There are many other pre-bound variables. There are many other variables with special meaning. There should be clear wording to the effect that these are only three of the special pre-bound variables in SHACL.
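
To illustrate the point made under "syntax of SPARQL variables" (that $ and ? are only markers and not part of the variable name), here is a minimal Python sketch. It is not taken from the SHACL draft or from any SPARQL library: variable_names is a hypothetical helper, and its regular expression is a simplified, ASCII-only approximation of the SPARQL VAR1 and VAR2 productions that does not skip string literals, IRIs, or comments.

```python
import re

# Simplified stand-in for SPARQL's VAR1 ('?' VARNAME) and VAR2 ('$' VARNAME).
# Assumption: ASCII-only names; strings, IRIs, and comments are not skipped.
_VAR = re.compile(r"[?$]([A-Za-z0-9_]+)")


def variable_names(query):
    """Return the set of variable names in `query`, without the ?/$ marker."""
    return {match.group(1) for match in _VAR.finditer(query)}


# Two spellings of the same (illustrative) query body.
with_dollar = "SELECT $this WHERE { $this rdf:type ?type }"
with_question = "SELECT ?this WHERE { ?this rdf:type ?type }"

# Under SPARQL's reading, both queries mention exactly the same variables.
same_variables = variable_names(with_dollar) == variable_names(with_question)
```

Under this reading a query that only ever writes ?this still mentions the variable this, so any rule about which variables are pre-bound has to be stated in terms of the bare name rather than the marker.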
Received on Friday, 23 September 2016 01:37:24 UTC