- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Sat, 26 Nov 2016 06:56:04 -0800
- To: "public-rdf-shapes@w3.org" <public-rdf-shapes@w3.org>
I did a read-through of Section 5 and Section 7. I found many problems. Some of these are likely simple problems of wording. Many others are problems with undefined or poorly defined notions.

There isn't even any definition or even description of how SPARQL-based constraints actually work within the larger context of a SHACL Full system. To see that, note that SPARQL-based constraints are to be "executed", but there is no discussion of when that happens.

Section 5 SPARQL-based Constraints

"As elaborated in the section on prefix handling rules, the value of sh:select must be transformable into a SPARQL 1.1 SELECT query. The query must project the result variable this in its SELECT clause."

There is no definition for the notion of projection here.

"The property sh:declare is used to make individual prefix declarations. The SHACL vocabulary defines the class sh:PrefixDeclaration for the values of sh:declare although no rdf:type triple is required for them."

Remove 'individual'. What is the SHACL vocabulary? Is the expected type sh:PrefixDeclaration?

"The recommended subject for values of sh:declare is the IRI of the graph containing the shapes that use the prefixes. These IRIs are often declared as an instance of owl:Ontology, but this is not required."

Not all shapes graphs will have IRIs. How is an IRI declared as an instance?

"These nodes can use the property sh:prefixes to specify a set of prefix mappings."

"The values of sh:prefixes must be IRIs or blank nodes. A SHACL processor collects a set of prefix mappings as the union of all single prefix mappings that can be reached by the property path sh:prefixes/owl:imports*/sh:declare starting at the SPARQL-based constraint. If such a collection of prefix declarations contains multiple namespaces for the same sh:prefix, then the shapes graph is invalid.
A SHACL processor transforms the values of sh:select (and similar properties such as sh:ask) into SPARQL by prepending PREFIX declarations for all namespace prefix mappings. Each value of sh:prefix is turned into the PNAME_NS, while each value of sh:namespace is turned into the IRIREF in the PREFIX declaration."

What is a 'single prefix mapping'? Why is this couched as the actions of a SHACL processor instead of just being a relationship between a node and a set of prefix mappings? What happens if an invalid set of prefix mappings exists in a shapes graph but is not used by any shape in the graph? What does 'the same sh:prefix' mean? The values of sh:select and similar properties are generally already SPARQL so why do they need to be transformed *into* SPARQL? What is a 'namespace prefix mapping'?

"The following table enumerates variables that have special meaning in SPARQL constraints."

It is not clear whether this is a complete enumeration (as suggested by the use of the word enumerate) or just some examples of variables that have special meaning.

"When SPARQL constraints are executed, the SHACL Full processor pre-binds values for these variables."

There is no notion of when SPARQL constraints are to be executed.

"If one of the solutions of the result set produced by a SELECT query contains the binding true for the variable failure, then the SHACL Full processor MUST signal a failure."

Produced under what circumstances?

"Otherwise, each row of the result set produced by a SELECT query MUST be converted into one validation result node."

This states that this MUST be done for all queries, even those queries that produce result sets due to being a value for sh:shape or similar parameters.

"The properties of those nodes are derived by the following rules, through a combination of result variables and the properties linked to the constraint itself. The production rules are meant to be executed from top to bottom, so that the first bound value will be used."
What is a property of a node? What is a production rule?

"The value of the variable path (only supports property IRIs, no complex paths)"

What does the parenthetical remark mean?

"The values of sh:message of the subject of the sh:select or sh:ask triple. These string literals may reference any binding of the SELECT result variables via {?varName} or {$varName}. If the constraint is based on a constraint component, then the component's parameter names can also be used. The {?varName} blocks SHOULD be substituted with suitable string representations of the values of said variables."

Which sh:select or sh:ask triple? How does a string literal reference anything? What is a suitable string representation of a variable value? Why is this only a SHOULD?

"Any such property needs to be declared via a value of sh:resultAnnotation at the subject holding the sh:select or sh:ask triple."

How does a subject hold a triple?

"Property                Value type     Count           Description
sh:annotationProperty   rdf:Property   1 (mandatory)   The annotation property that shall be set
sh:annotationVarName    xsd:string     0..1            The name of the SPARQL variable to take the values from
sh:annotationValue                     0..unlimited    Constant nodes that shall be used as default values"

What does the value type column mean here? What does the count column mean here?

"For each solution of a SELECT result set, a SHACL Full processor MUST walk through the declared result annotations."

This should be specified as a relationship, not something that a SHACL processor does.

"1. If a sh:resultAnnotation has a value for the property sh:annotationVarName then the SHACL Full processor MUST look for the variable named after the sh:annotationVarName
2. Otherwise, the SHACL Full processor MUST derive a variable name from the value of sh:annotationProperty using the same local name mechanism as described earlier"

What does it mean for a variable to be named after something? There is no local name mechanism described earlier in the document.
In one case the processor looks for something, in the other case the processor derives a name. These are different categories of action.

"If a variable name could be determined, then the SHACL Full processor MUST copy the bindings for the given variable into the constructed validation results for the current solution."

This reads as if all the bindings are copied into each validation result, which doesn't appear to make sense. How are the values copied into the result?

"The values of sh:annotationProperty must not be from the SHACL namespace, to avoid clashes with variables that are already produced by other means."

This does not prevent clashes with other variables.

Section 7 SPARQL-based Targets

"All subjects of sh:target triples must be IRIs."

This makes absolutely no sense. Why should the shapes that have SPARQL-based targets be IRIs?

"The SPARQL queries linked to a target via sh:select must be of the query form SELECT."

This doesn't apply the prefix handling rules.

"The SELECT queries must project to the result variable this."

There is no definition for "project to".

"The resulting target consists of all distinct bindings for the variable this."

The target is the value of sh:target so how can it be the bindings?

"The SELECT queries must also be executable when converted to an ASK query and with a pre-bound value for ?this."

There is no definition for converting a SELECT query to an ASK query. There is no notion of a query being executable.

"The set of bindings for ?this that return true for such ASK queries must be identical to the set produced by the SELECT query. This design makes sure that SHACL Full processors can validate whether a given shape applies to a given individual focus node."

A SHACL Full processor can always just run the SELECT query and check whether the individual focus node is in the result set. So this condition, which is difficult and maybe even impossible to check, is unnecessary.
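The direct check described above can be sketched roughly as follows. This is only an illustration under stated assumptions: the run_select callable, the stub engine, and the example IRIs are hypothetical stand-ins for a real SPARQL engine and data, and none of them come from the SHACL draft or from any library.

```python
from typing import Callable, Dict, Iterable, Set

# Hypothetical engine interface (an assumption for this sketch): takes a
# SPARQL SELECT string and yields one dict per solution, mapping variable
# names (without the leading "?") to terms written as strings.
SelectRunner = Callable[[str], Iterable[Dict[str, str]]]


def target_nodes(run_select: SelectRunner, query: str) -> Set[str]:
    """Collect the distinct bindings of ?this produced by the target query."""
    return {row["this"] for row in run_select(query) if "this" in row}


def is_in_target(run_select: SelectRunner, query: str, focus_node: str) -> bool:
    """Test membership in the SELECT result; no SELECT-to-ASK rewrite is used."""
    return focus_node in target_nodes(run_select, query)


def fake_engine(query: str) -> Iterable[Dict[str, str]]:
    """Stand-in for a real SPARQL engine so the sketch can run on its own."""
    return [
        {"this": "http://example.org/alice"},
        {"this": "http://example.org/bob"},
        {"this": "http://example.org/alice"},  # duplicate solution
    ]


QUERY = "SELECT ?this WHERE { ?this a <http://example.org/Person> }"
```

With a real engine in place of fake_engine, is_in_target(engine, QUERY, node) would answer whether the shape applies to node using only the SELECT form of the query.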
The checking can even be done completely within SPARQL by appending a values clause to the query, which can then be optimized by the SPARQL processor, so there is not even any particular reason to have this condition for efficiency purposes.

"Similar to constraint components, such targets take parameters that are interpreted when the target is evaluated."

There is no notion of evaluation of targets.

"All parameters of target types are expected to have sh:maxCount 1."

What does expectation mean in the context of parameters?

SPARQL-based target types appear to have many of the same characteristics as sh:SPARQLTarget targets. Why then do the underlying queries not have the same restrictions? Why are the queries not restricted to those whose top-level SELECT includes (or has only) this in the variables of its select clause? Why are queries not restricted to ones that can be converted to ASK queries that have the same behaviour?

Peter F. Patel-Schneider
Nuance Communications
Received on Saturday, 26 November 2016 14:56:39 UTC