- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Sun, 6 Mar 2016 12:59:42 -0800
- To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
General

There are quite a few MUSTs in the document that are inappropriately used. For example, instead of "a SHACL processor MUST use the union of the focus nodes produced by these scopes" say "the scope of a shape is the union of the sets of nodes produced by these scopes". The place to use MUST is in wording like "a SHACL processor MUST validate a shape against a data graph as described herein".

The document specifies too much about how a SHACL engine is supposed to perform validation. Instead it SHOULD only specify what validation is and let developers decide how to implement validation.

There are bad uses of "supposed to" and "expected to be". What happens if these suppositions or expectations are not met? Instead use precise wording.

SHACL should treat both the shapes graph and the data graph as unchanging. Wording should be inserted to the effect that if the shapes graph or the data graph changes during validation, the results are undefined. Wording that implies temporal effects should be removed.

There needs to be some definition of just what a SHACL validation engine does, something like "A SHACL validation engine takes two RDF graphs, a shapes graph and a data graph, and validates the data graph against the shapes graph as described herein. SHACL tools may provide other interfaces, e.g., one that takes an RDF graph in an RDF dataset, adds other triples to that graph to produce the data graph, e.g., by using owl:imports triples, and/or use triples in that graph to produce the shapes graph, e.g., using sh:shapesGraph triples as described herein. A SHACL validation engine MUST implement all constructs in the core of SHACL (Sections 2, 3, 5). A SHACL engine MAY not implement the other parts of SHACL but if it does implement a portion of ... it must implement all of ...."

Abstract

The abstract says that in SHACL "[a]dditional constraints can be associated with shapes using SPARQL and similar extension languages."
This is not the case as there is no specification for any execution language other than SPARQL.

1. Introduction

The shapes graph is not a set of shapes. Instead it contains a set of shapes.

"data nodes" -> "nodes in the data graph"

The output of the validation process may be just a YES/NO. The validation report may be truncated, and not include all validations.

The scope of the shape ex:IssueShape in EXAMPLE 1 is not "all nodes that have an rdf:type link to ex:Issue". The scope of the shape ex:IssueShape in EXAMPLE 1 does not state "that all instances of the class ex:Issue shall be validated against the shape ex:IssueShape". The actual scope is instead somewhere between these two sets of nodes.

It is not sufficient to say in 1.1 that SHACL has unique versions of types and instances. These notions are in very widespread use. Each time that SHACL deviates from the common, accepted W3C practice it should be called out, e.g., "SHACL type" or "SHACL instance".

The paragraph on special cases is hard to understand. At a minimum, the effects of all these special cases need to be called out whenever they occur. It would be much better to remove the special cases.

The document doesn't actually use SPARQL 1.1 as the normative definition of SHACL constraints and scopes. What it does is intersperse SPARQL with hasShape. A cleaner statement is needed here to clarify the role of SPARQL 1.1 in the semantics of the core of SHACL.

The wording about sh:hasShape seems to imply SPARQL engines that only cover the core need not provide a mechanism that mirrors sh:hasShape. This might be true at the moment, as recursion is not supported, but may not be true in the future.

Graphs do not have URIs in general. Instead they may be available at a URL. They may also be in a dataset, in which case they may have an IRI (or whatever) in that dataset.

2. Shapes

2.1 Scopes

Class-based scopes use rdfs:subClassOf as well.
"that are supposed to be validated against" -> "that are in the scope of" "The scope of a shape is a set of nodes. It is not necessary that a node be present in the data graph to belong to a scope." "Triples with predicate sh:scopeNode in the shapes graph from a shape to a node create a scope for the shape containing only the node. SPARQL implementations MAY produce a warning if this node is not present in the data graph." Don't laud RDF Schema. SHACL doesn't use RDF Schema. "Triples with predicate sh:scopeClass in the shapes graph from a shape to a node create a scope for the shape containing the nodes in the data graph that are SHACL instances of the node in the data graph (linked to the node via rdf:type or is linked to the node via rdf:type followed by a chain of rdfs:subClassOf links in the data graph)." At this point I stopped reading so carefully, but similar problems continue. 2.2 Filters A SHACL processor might not begin by validating filters. It might instead look at all in-scope nodes and only later remove those that don't pass the filters. The document makes this illegal but it might be useful if the filter shape is expensive to compute and few violations are expected. The UML-like diagram is misleading. It uses rdfs:Resource in a way different from SHACL. 3. Core Constraint Types "points at" is used with two different meanings in the document. Variables don't point at anything. They have a value. At this point I started skipping large chunks. 4. Declaring the Shapes Graph RDF graphs don't have to be part of a dataset. What happens then? This should be moved to a section that discusses how to get to the stage where there is a shapes graph and a data graph that can be validated against it. 12. Entailment RDF does not define the notion of the IRI of a graph. Therefore if SHACL is going to use this notion it must define it on its own.
Received on Sunday, 6 March 2016 21:00:18 UTC