- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Fri, 5 May 2017 07:12:35 -0700
- To: "public-rdf-shapes@w3.org" <public-rdf-shapes@w3.org>
This is a formal objection to the exclusion from SHACL of numerous shapes that have well-behaved intuitive meanings. A number of these exclusions first appear in the 11 April 2017 version of SHACL. The overall severity of these exclusions was noticed during an attempt to produce an implementation of SHACL Core.

There is no technical reason to exclude the numerous shapes that have been excluded from SHACL, and there are good reasons to permit them. The exclusions make SHACL harder for users to write. As SHACL implementations are free to behave however they want on shapes graphs that contain these excluded shapes, the exclusions serve as an impediment to interoperability in SHACL.

In most cases the only change required to SHACL to no longer exclude these shapes, and thereby improve the utility and the interoperability of SHACL, is to remove the syntax rules that exclude them. There is no need to change the semantics of SHACL at all to include these shapes. The burden on implementers will be very low, and in many cases there will be no burden at all, as these excluded shapes act like similar non-excluded shapes.

Some of these exclusions are as if a programming language made (x>4 && x>6) illegal, but left (x>4 && x>=6) legal. These syntax restrictions make it hard for SHACL users to write legal SHACL shapes graphs. If there is a need to warn users about these shapes, it is better to do so using a lint-like tool that is not tied to the syntax of SHACL.

Actually, the situation is even worse in SHACL than it would be if a programming language had these sorts of exclusions. Completely conforming SHACL implementations are free to do anything at all on most invalid syntax, so a SHACL implementation can signal an error or return any set of validation results on shapes that contain the analog of (x>4 && x>6), even though the intuitive behaviour of these shapes is completely clear.

Some of the excluded shapes are degenerate in that they will produce validation results for all focus nodes or all focus nodes that have a value for a particular property. For example, the excluded shape

  ex:false a sh:PropertyShape ;
    sh:path ex:p ;
    sh:datatype xsd:integer ;
    sh:datatype xsd:int .

will produce a validation result for every focus node that has a value for ex:p. However, many shapes that are degenerate in this sense are not excluded.

Others of the excluded shapes are not degenerate in this way but instead have constraints that are redundant. For example, in the excluded shape

  ex:redundant a sh:PropertyShape ;
    sh:path ex:p ;
    sh:minInclusive 5 ;
    sh:minInclusive 9 .

one of the constraints is redundant. However, many similar shapes that also have redundant constraints are not excluded.

Some of the excluded shapes don't even have any redundant constraints. For example, in the excluded shape

  ex:dateTime a sh:PropertyShape ;
    sh:path ex:p ;
    sh:minInclusive "2002-10-10T12:00:00-05:00"^^xsd:dateTime ;
    sh:minInclusive "2002-10-10T12:00:00"^^xsd:dateTime .

neither of the constraints is redundant. To exclude all dateTime values before "2002-10-10T12:00:00-05:00"^^xsd:dateTime and also before "2002-10-10T12:00:00"^^xsd:dateTime requires two uses of sh:minInclusive.

Other excluded shapes are somewhat degenerate for standard RDF graphs but not for generalized RDF graphs, where literals can be subjects of triples. For example, in standard RDF the excluded shape

  ex:generalized a sh:NodeShape ;
    sh:lessThan ex:p .

produces a validation report for nodes that have values for ex:p and can be better stated for standard RDF graphs as a property constraint with path ex:p and sh:maxCount 0. In generalized RDF this shape is not degenerate, as it also produces a validation report for some literals. However, many shapes that are degenerate in this way for standard RDF graphs but not degenerate for generalized RDF graphs are not excluded.
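
The claim that neither sh:minInclusive in ex:dateTime is redundant can be checked with a small sketch. It assumes the XML Schema order relation for dateTime, under which a value with a time zone and a value without one are only comparable when they are more than 14 hours apart, and it assumes that an indeterminate comparison fails the constraint. The function and variable names are hypothetical and are not taken from SHACL or from any implementation.

```python
from datetime import datetime, timedelta, timezone

# Assumption: XML Schema treats an unzoned dateTime as lying anywhere within
# +/- 14 hours of the same clock reading in UTC.
MAX_OFFSET = timedelta(hours=14)


def utc_bounds(dt):
    """Return the (earliest, latest) UTC instants dt could denote, as naive datetimes."""
    if dt.tzinfo is not None:
        instant = dt.astimezone(timezone.utc).replace(tzinfo=None)
        return instant, instant
    return dt - MAX_OFFSET, dt + MAX_OFFSET


def definitely_ge(value, bound):
    """True only when value >= bound is determinate; indeterminate counts as failure."""
    if (value.tzinfo is None) == (bound.tzinfo is None):
        # Both zoned or both unzoned: an ordinary total-order comparison.
        return value >= bound
    # Mixed case: value must be later than every instant the bound could denote.
    value_earliest, _ = utc_bounds(value)
    _, bound_latest = utc_bounds(bound)
    return value_earliest > bound_latest


# The two bounds used in ex:dateTime.
BOUND_ZONED = datetime(2002, 10, 10, 12, 0, tzinfo=timezone(timedelta(hours=-5)))
BOUND_LOCAL = datetime(2002, 10, 10, 12, 0)


def conforms(value):
    """Both sh:minInclusive constraints must hold (conjunction)."""
    return definitely_ge(value, BOUND_ZONED) and definitely_ge(value, BOUND_LOCAL)


# Hypothetical sample values for ex:p.
zoned_value = datetime(2002, 10, 10, 18, 0, tzinfo=timezone.utc)  # passes only the zoned bound
local_value = datetime(2002, 10, 10, 13, 0)                       # passes only the unzoned bound
late_value = datetime(2002, 10, 12, 0, 0, tzinfo=timezone.utc)    # passes both bounds
```

Under these assumptions each bound rejects a value that the other accepts, so dropping either constraint changes which values conform.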

Yet others of the excluded shapes are not excluded for any discernible reason at all. For example, the excluded shape

  ex:comments a sh:PropertyShape ;
    sh:path [ rdfs:comment "inverse of ex:p" ;
              sh:inversePath ex:p ] ;
    sh:class ex:C .

does not have any problems whatsoever. Its meaning is clear. It does not accept or reject all nodes. It does not have any redundant pieces. It is excluded for no discernible reason.

It is easy for users to accidentally write these excluded shapes. Automatic generation of SHACL shapes is made harder by these exclusions. There are extra costs when implementing SHACL syntax checking because of these exclusions.

Interoperability is particularly harmed by these exclusions. Fully conforming SHACL implementations may implement ex:false as producing a validation report for all nodes. They may instead implement ex:false as never producing a validation report, on the argument that multiple sh:datatype constraints should just be removed. They may implement ex:false as not producing a validation report for literals with datatype xsd:integer, or xsd:int, because the implementation assumes that only one sh:datatype can be present. They may even implement ex:false as sometimes not producing a validation report for literals with datatype xsd:integer, and sometimes not producing a validation report for literals with datatype xsd:int.
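
The divergence described for ex:false can be made concrete with a minimal sketch. The three functions below are hypothetical stand-ins for three implementations that would all count as conforming, since ex:false is outside the defined syntax; none of them is taken from a real SHACL engine. Each returns True when a validation result is produced for a value with the given datatype.

```python
# Datatypes are written as prefixed names for brevity.
DECLARED = ["xsd:integer", "xsd:int"]  # the two sh:datatype values of ex:false


def conjunctive(value_datatype, declared):
    """Treat every sh:datatype as a separate constraint that must hold."""
    return any(value_datatype != d for d in declared)


def drop_when_multiple(value_datatype, declared):
    """Discard the constraints entirely when more than one sh:datatype is present."""
    if len(declared) > 1:
        return False
    return any(value_datatype != d for d in declared)


def first_only(value_datatype, declared):
    """Assume only one sh:datatype can be present and look at just one of them."""
    return value_datatype != declared[0]


def reports(value_datatype):
    """Collect what each hypothetical implementation says about one value."""
    return {
        "conjunctive": conjunctive(value_datatype, DECLARED),
        "drop_when_multiple": drop_when_multiple(value_datatype, DECLARED),
        "first_only": first_only(value_datatype, DECLARED),
    }
```

For a literal with datatype xsd:int the three sketches disagree with one another, which is the interoperability problem in miniature. The fourth behaviour mentioned above, where the datatype that is honoured varies between runs, would correspond to first_only picking an arbitrary element of the declared values.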

These exclusions should be removed from SHACL by removing the syntax rules

  datatype-maxCount, nodeKind-maxCount, minCount-scope, minCount-maxCount,
  maxCount-scope, maxCount-maxCount, minExclusive-maxCount,
  minInclusive-maxCount, maxExclusive-maxCount, maxInclusive-maxCount,
  minLength-maxCount, maxLength-maxCount, languageIn-maxCount,
  uniqueLang-scope, lessThan-scope, lessThanOrEquals-scope,
  qualifiedValueShape-scope, and in-maxCount

and changing the syntax rules path-metarule, path-non-recursive, path-predicate, path-sequence, path-alternative, path-inverse, path-zero-or-more, path-one-or-more, and path-zero-or-one to

  path-non-recursive
    A node p is not a well-formed SHACL property path if p is a blank node and any of the following rules require, directly or indirectly, determining whether p is a well-formed SHACL property path.

  path-metarule
    A node is a well-formed SHACL property path if it satisfies exactly one of the following rules and, if it is a blank node, it does not have a value for more than one of rdf:first or rdf:rest, sh:alternativePath, sh:inversePath, sh:zeroOrMorePath, sh:oneOrMorePath, and sh:zeroOrOnePath.

  path-predicate
    A predicate path is any IRI.

  path-sequence
    A sequence path is a blank node that is a SHACL list with at least two members and each member of the list is a well-formed SHACL property path.

  path-alternative
    An alternative path is a blank node that has exactly one value for sh:alternativePath and that value is a SHACL list with at least two members and each member of the list is a well-formed SHACL property path.

  path-inverse
    An inverse path is a blank node that has exactly one value for sh:inversePath and that value is a well-formed SHACL property path.

  path-zero-or-more
    A zero-or-more path is a blank node that has exactly one value for sh:zeroOrMorePath and that value is a well-formed SHACL property path.

  path-one-or-more
    A one-or-more path is a blank node that has exactly one value for sh:oneOrMorePath and that value is a well-formed SHACL property path.

  path-zero-or-one
    A zero-or-one path is a blank node that has exactly one value for sh:zeroOrOnePath and that value is a well-formed SHACL property path.

The change to path syntax not only permits paths that should not be excluded but also excludes paths that should not be included, such as

  [ rdf:first ex:p ; rdf:rest ( ex:q ) ; sh:inversePath ex:p ]

These changes to the syntax of SHACL result in a SHACL that is easier to write, easier to generate, easier to implement, and more interoperable.
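
One possible reading of the revised path rules can be sketched as a small recursive check. This is an illustration under stated assumptions, not a reference implementation: the graph is a plain Python set of (subject, predicate, object) string triples, blank nodes are strings starting with "_:", literals are strings starting with a double quote, and every function name is hypothetical.

```python
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"
RDFS = "http://www.w3.org/2000/01/rdf-schema#"
SH = "http://www.w3.org/ns/shacl#"
EX = "http://example.org/"
FIRST, REST, NIL = RDF + "first", RDF + "rest", RDF + "nil"
ALTERNATIVE = SH + "alternativePath"
UNARY = [SH + "inversePath", SH + "zeroOrMorePath",
         SH + "oneOrMorePath", SH + "zeroOrOnePath"]


def values(graph, subject, predicate):
    """All objects of triples with the given subject and predicate."""
    return [o for s, p, o in graph if s == subject and p == predicate]


def list_members(graph, head):
    """Members of a well-formed list starting at head, or None if it is malformed."""
    members, seen = [], set()
    while head != NIL:
        firsts, rests = values(graph, head, FIRST), values(graph, head, REST)
        if head in seen or len(firsts) != 1 or len(rests) != 1:
            return None
        seen.add(head)
        members.append(firsts[0])
        head = rests[0]
    return members


def is_path(graph, node, active=frozenset()):
    """Check node against the revised rules; active holds blank nodes being checked."""
    if not node.startswith("_:"):
        return not node.startswith('"')          # path-predicate: any IRI
    if node in active:
        return False                             # path-non-recursive
    active = active | {node}
    keys = [k for k in [ALTERNATIVE] + UNARY if values(graph, node, k)]
    is_list = bool(values(graph, node, FIRST) or values(graph, node, REST))
    if len(keys) + is_list != 1:
        return False                             # path-metarule: exactly one kind
    if is_list:
        members = list_members(graph, node)      # path-sequence
    elif keys[0] == ALTERNATIVE:
        alts = values(graph, node, ALTERNATIVE)  # path-alternative
        members = list_members(graph, alts[0]) if len(alts) == 1 else None
    else:
        targets = values(graph, node, keys[0])   # inverse, zero-or-more, etc.
        return len(targets) == 1 and is_path(graph, targets[0], active)
    return (members is not None and len(members) >= 2
            and all(is_path(graph, m, active) for m in members))


# The path of ex:comments: an inverse path carrying an rdfs:comment.
comments_graph = {
    ("_:c", RDFS + "comment", '"inverse of ex:p"'),
    ("_:c", SH + "inversePath", EX + "p"),
}

# The mixed sequence/inverse path that the revised rules exclude.
mixed_graph = {
    ("_:m", FIRST, EX + "p"), ("_:m", REST, "_:m2"),
    ("_:m2", FIRST, EX + "q"), ("_:m2", REST, NIL),
    ("_:m", SH + "inversePath", EX + "p"),
}
```

In this sketch is_path(comments_graph, "_:c") holds because properties outside the path vocabulary are ignored, while is_path(mixed_graph, "_:m") fails because the node carries two kinds of path structure at once.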
Received on Friday, 5 May 2017 14:13:11 UTC