- From: Holger Knublauch <holger@topquadrant.com>
- Date: Mon, 27 Feb 2017 12:13:31 +1000
- To: public-rdf-shapes@w3.org
I have raised https://www.w3.org/2014/data-shapes/track/issues/234 to track our response. We will likely respond after the next WG meeting this week.

Holger


On 26/02/2017 1:03, Peter F. Patel-Schneider wrote:
> Here are some comments on this document. In summary, there are still lots
> of significant problems. Addressing these problems will result in
> substantial changes to this document.
>
> I have not examined the portions of the document labelled as non-normative
> as closely as the normative sections. I have treated all portions of the
> document labelled as non-normative as if they do not contain any normative
> content.
>
> Some of the comments below are about problems that occur in multiple places
> in the document. I have not always listed all of these places.
>
> It was hard to decipher the odd wording in many places to get to the
> underlying meaning. The document needs a rewrite to state the definition of
> SHACL in a consistent manner. Part of the problem here, but only part, is
> the mixing of different words and phrases that can carry definitional import,
> including "if ... then", "declared", "specified", and "is". Another part of
> the problem is the use of "MUST" in the definitions of constraint
> components.
>
> This set of comments is separate from my previous comments on SHACL.
>
>
> ** Problems in early definitions:
>
> "A property path is a possible route in a graph between two graph nodes."
> Route is not defined. Possible route is not defined. (This wording is used
> once in the SPARQL document but is not defined there.)
>
> "A binding is a pair (variable, RDF term), consistent with the term's use in
> SPARQL."
> If binding is taken from SPARQL, just link to its definition, as is done for
> the other terms taken from SPARQL.
>
> "A solution is a set of bindings, one row in the body of the result table of
> a SPARQL query."
> In SPARQL a solution is not one row in the body of the result table of a
> SPARQL query.
> That is just how it is shown in some places. The actual
> correct term is solution mapping.
>
> "The results table is a SolutionSequence, a list of solutions, possibly
> unordered."
> Results table is not used in SHACL.
>
> "A node in an RDF graph is a SHACL instance of a SHACL class in the graph if
> one of its SHACL types is the class."
> SHACL types require reference to a graph.
>
> "true denotes the RDF term "true"^^xsd:boolean. false denotes the RDF term
> "false"^^xsd:boolean."
> There is no notion of what denotes means here.
>
> "Target declarations are values of certain properties (such as
> sh:targetClass) for a shape in a shapes graph."
> More than just a value is needed.
>
> "All bindings of the variable this from the solution become focus nodes."
> SPARQL queries do not return a solution as defined in this document.
>
>
> ** Unclear wording:
>
> "(In this document, the verbs specify or declare are sometimes used to
> express the fact that an RDF term has property values in a graph.)"
> As opposed to an RDF term not having any values for any property in a graph?
>
> "A constraint component is an IRI."
> I don't see how every IRI is a constraint component.
>
> "The IRI is used, among others, in validation reports."
> I never would have imagined that validation reports could only use IRIs that
> are constraint components.
>
> "SHACL-SPARQL can be used to declare additional constraint components based
> on SPARQL."
> What part of SHACL are these additional constraint components in?
>
>
> ** Strange links:
>
> The definition of member has a link back to itself.
>
>
> ** Normative wording in non-normative portions of the document:
>
> Section 1.6 is labelled as non-normative but discusses how to treat other
> portions of the document and states conformance requirements for SHACL
> implementations.
>
> In many places there are what are labelled as SPARQL definitions of SHACL
> Core constraint components. These are in non-normative portions of the
> document.
> The document must not give the impression that these bits of
> SPARQL are definitions of portions of the SHACL Core.
>
> There are probably other places where definitional wording occurs in
> non-normative parts of the document. These should be removed or changed to
> not give any impression that they are normative.
>
>
> ** SHACL Vocabulary:
>
> What is the status of the SHACL vocabulary? SHACL is not an ontology that
> needs an RDF graph interpreted using the RDFS semantics to provide its
> vocabulary. All that SHACL needs is a set of IRIs that are used in its RDF
> syntax. What then is the status of the mentioned RDF graph? How is this
> graph to be interpreted? Does information entailed by the graph using the RDF
> or RDFS semantics have any effect on SHACL? Is any information in the graph
> part of the definition of SHACL? Of SHACL Core? Is all of the information
> in the graph part of the definition of SHACL? Of SHACL Core?
>
>
> ** Shapes:
>
> If any node in a shapes graph has a sh:shape link back to itself then the
> shapes graph is recursive and behaviour of SHACL processors on the graph is
> undefined even if this node is completely disconnected from the rest of the
> graph. It would be better to have the behaviour of SHACL processors defined
> in cases like this.
>
> It used to be that top-level shapes were conventionally indicated by an
> rdf:type link to sh:Shape. This has apparently changed to sh:NodeShape.
> There is no apparent reason for this change, which should be reverted.
>
> sh:Shape does not appear to have any effect at all in SHACL. If this is the
> case then it should be removed.
>
> The syntax rule for shapes doesn't appear to be a syntax rule at all.
> Instead it is defining what a shape is.
>
> "A shape in a shapes graph declares a constraint of kind c if c is a
> constraint component and the shape has values for all mandatory parameters
> of c.
> The constraint declaration consists of the values that the shape has
> for all mandatory and optional parameters of that component."
> The word "declares" is only harmful here.
> For constraint components that have more than one parameter this definition
> loses which value is connected to which parameter. For shapes that have two
> values for a single-parameter constraint component there is only one
> resultant constraint and that constraint has both of the values of the
> parameter.
>
> "Note that the definition above does not include all of the syntax rules of
> well-formed shapes."
> There is no notion of well-formed shapes introduced in the document, even
> though it is used in several places. Similarly there is no notion of
> well-formed property shape or well-formed node shape introduced in the
> document even though both of these are used in the document.
>
> "Note that the definitions of well-formed property shapes and node shapes
> make these two sets of nodes disjoint."
> There is no definition of either well-formed property shape or well-formed
> node shape.
>
>
> ** Property Paths:
>
> "A node in an RDF graph is a well-formed SHACL property path p if it
> satisfies exactly one of the syntax rules in the following sub-sections. A
> node p is not a well-formed SHACL property path if p is a blank node and any
> path mappings of p directly or transitively reference p."
> It is possible that a node could both satisfy exactly one of the syntax
> rules and also refer back to itself. What happens then?
> Every path mapping of p references p so all blank nodes are not well-formed
> SHACL property paths.
>
> "A sequence path is a blank node that is a SHACL list with at least two
> members and each member is a well-formed SHACL property path."
> Sequence paths can have extra information associated with them.
>
> "An alternative path is a blank node that is the subject of exactly one
> triple in G." "An inverse path is a blank node that is the subject of
> exactly one triple in G." And so on.
> These paths can't have extra information associated with them. Any blank
> node that is the subject of exactly one triple is lots of kinds of paths.
>
>
> ** Non-validating information
>
> What happens if the requirements in the non-normative 2.3.2 are violated?
>
>
> ** Targets:
>
> "If s is a SHACL instance of sh:NodeShape or sh:PropertyShape in a shapes
> graph SG and s is also a SHACL instance of rdfs:Class in SG then the set of
> SHACL instances of s in a data graph DG is a target from DG for s in SG."
> So a node that is a SHACL instance of sh:Shape and a SHACL instance of
> rdfs:Class will not produce an implicit class target. This is going to
> trip up a lot of people and needs to be changed.
>
>
> ** Validation:
>
> "Conformance checking is a simplified version of validation, producing a
> boolean result."
> There is no definition of which boolean value conformance checking produces.
>
> "the validation process"
> What counts as part of the validation process? Does checking for
> ill-formedness? Does entailment? Does piecing together the shapes graph or
> the data graph?
>
> "For example, SHACL processors may support recursion scenarios or produce an
> error when they detect recursion."
> I expect that this should be failure instead of error.
>
> "A shapes graph is an RDF graph containing zero or more shapes that is
> passed into a SHACL validation process so that a data graph can be validated
> against the shapes."
> Is the graph that is used when a node in a graph is validated against a
> shape in it a shapes graph?
>
> "Every value of sh:shapesGraph is an IRI representing a graph that should be
> included into the shapes graph used to validate the data graph."
> Shouldn't this be SHOULD? Or should it be MAY as is used in the next
> sentence?
>
> "Validating an RDF term against a shape involves validating the term against
> each of the components for which the shape has values for all mandatory
> parameters, using the validators associated with the respective component."
> This incorrectly uses constraint components instead of constraints.
>
> "The validation of a focus node in the data graph against a constraint in
> the shapes graph produces the top-level validation results that are produced
> by the validator of the constraint component, using as input the focus node,
> the specific values of the parameters in the constraint, and the value nodes
> of the shape in the data graph."
> SHACL constraint components have multiple validators.
> Validation needs more than just the information listed above.
> Shapes don't have value nodes at all.
> The output of a validation process is not completely defined, so there is no
> notion of *the* results of validation.
>
> "A focus node conforms to a shape if and only if the validation of the shape
> does not produce any validation result or a failure."
> So no focus node will conform to a node shape that has a sh:not parameter.
>
>
> ** Conformance:
>
> "All SHACL implementations MUST at least cover the Core."
> Covering is not defined.
>
> "This specification describes conformance criteria for: SHACL Core [...],
> SHACL-SPARQL [...], SHACL Shapes Graphs [...], Validation of a data graph
> against a shapes graph [...], Validation of an RDF term from a data graph
> against a shape from the shapes graph [...], SHACL Core processors [...],
> SHACL-SPARQL processors [...]."
> Conformance is something that is required of implementations. It doesn't
> make sense for special kinds of graphs or processes to conform. Instead
> these are defined as something.
>
> An RDF graph is a (mathematical) set of RDF triples. There are no
> operations that can modify an RDF graph defined in RDF.
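
The remark that an RDF graph is a mathematical set of triples, with no modifying operations, can be illustrated with a minimal Python sketch. It assumes triples can be modelled as 3-tuples of strings and graphs as frozensets; the names Triple, Graph, with_triple and the ex: terms are hypothetical and come from neither RDF nor SHACL.

```python
from typing import FrozenSet, Tuple

# Assumption: a triple is (subject, predicate, object) as plain strings.
Triple = Tuple[str, str, str]
# Assumption: a graph is an immutable set of triples.
Graph = FrozenSet[Triple]


def with_triple(graph: Graph, triple: Triple) -> Graph:
    """Return a new graph that also contains `triple`.

    The input graph is not changed; "adding" a triple yields a different set.
    """
    return graph | {triple}


data_graph: Graph = frozenset({("ex:Alice", "rdf:type", "ex:Person")})
extended = with_triple(data_graph, ("ex:Alice", "ex:knows", "ex:Bob"))

# The original graph still has exactly one triple; the result is a new set.
assert len(data_graph) == 1
assert len(extended) == 2
# frozenset offers no in-place mutation such as add().
assert not hasattr(data_graph, "add")
```

Under this reading, anything a specification describes as modifying a graph would have to be restated as producing a different graph.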
>
>
> ** Core Constraint Components:
>
> "This section defines the built-in SHACL Core constraint components that
> MUST be supported by all SHACL Core processors."
> Are SHACL-SPARQL processors required to support these components as well?
>
> "The SPARQL definitions in this section represent potential validators."
> The SPARQL queries are in informative parts of the document and can't be
> considered to be definitions. As many of the SPARQL
> definitions currently have problems, it would be better to just remove all
> the SPARQL query stuff from this section.
>
> "The following constraint components represent restrictions on the number of
> value nodes."
> Presumably the number of value nodes for a particular focus node, not the
> number of value nodes overall.
>
> "If this parameter is omitted then there is no limit on the number of
> triples."
> Which triples? How can this parameter be omitted and there still be a
> constraint set up for it?
>
> "The values of sh:pattern are literals with datatype xsd:string that are
> valid pattern arguments for the SPARQL REGEX function."
> Having this as a syntax condition means that
> checking for correct operation of SHACL processors will need to be aware of
> whether a value for sh:pattern is a valid pattern argument for the SPARQL
> REGEX function.
>
> "If $flags has a value then it MUST be interpreted according to the third
> argument of the SPARQL REGEX function."
> What does this mean? I'm guessing that the constraint component is supposed
> to act as if this value is the third argument, not that this value is
> interpreted as anything.
>
> "For each pair of value nodes and the values of the property $lessThan at
> the given focus node where the first value is not less than the second value
> (based on SPARQL's < operator) or where the two values cannot be compared, a
> validation result MUST be produced with the value node as sh:value."
> How many different validation results need to be produced if the set of
> value nodes is the set { 1, 2 } and the set of property values is the set
> { "a", "b" }?
>
> "For each value node that produces no validation results against the shape
> $not"
> So if there is a conjunction under the not, a validation result will be
> produced here if just one of the conjuncts produces a validation result.
>
> "For each value node where the validation of the value node against any of
> the members of $and produces a validation result and no failure, a
> validation result MUST be produced with the value node as sh:value."
> So if there is a negation under any of the conjunctions a validation result
> will always be produced.
>
> "If a value node is violating the constraint, sh:shape will produce only a
> single validation result, with sh:ShapeConstraintComponent as its
> sh:sourceConstraintComponent."
> Not correct. The sh:shape could produce other validation results for other
> value nodes.
>
> "On the other hand side, sh:property may produce any number of validation
> results, and these will have the individual constraint components of the
> property shape as their values of sh:sourceConstraintComponent."
> Not correct. The validation results produced here may have other values for
> sh:sourceConstraintComponent.
>
>
> ** Specify and Declare:
>
> Specify and declare are used throughout the document in strange ways. It
> would be better to replace these with words that do not have so much
> baggage, e.g., "For example, shapes can state" or "A shape in a shapes graph
> has a constraint of".
>
>
> ** SHACL-SPARQL:
>
> All aspects of SHACL-SPARQL depend heavily on pre-binding. As pre-binding
> has never had a workable definition in SHACL there is no purpose in closely
> reviewing this part of the document at this time. Either this part of the
> document needs to be removed or a workable definition of SHACL provided and
> this phase of the W3C process repeated.
>
> It is not clearly stated what SHACL Core processors need to do when they
> encounter constructs from SHACL-SPARQL. It might be the case that SHACL
> Core processors can completely ignore SHACL-SPARQL constructs, because SHACL
> Core processors only "cover" SHACL Core, but it might also be the case that
> SHACL Core processors need to examine SHACL-SPARQL constructs, because these
> constructs might cause some nodes to be ill-formed.
>
> Both sh:prefixes and sh:prefix are used. I expect that only one is needed.
>
> Peter F. Patel-Schneider
> Nuance Communications
Received on Monday, 27 February 2017 02:36:20 UTC