comments on SHACL 3 March editors draft

General

There are quite a few MUSTs in the document that are used inappropriately.  For
example, instead of "a SHACL processor MUST use the union of the focus nodes
produced by these scopes" say "the scope of a shape is the union of the sets
of nodes produced by these scopes".  The place to use MUST is in wording
like "a SHACL processor MUST validate a shape against a data graph as
described herein".

The document specifies too much about how a SHACL engine is supposed to
perform validation.  Instead it SHOULD only specify what validation is and let
developers decide how to implement validation.

There are bad uses of "supposed to" and "expected to be".  What happens if
these suppositions or expectations are not met?  Instead use precise
wording.

SHACL should treat both the shapes graph and the data graph as unchanging.
Wording should be inserted to the effect that if the shapes graph or the
data graph changes during validation then the results are undefined.
Wording that implies temporal effects should be removed.

There needs to be some definition of just what a SHACL validation engine
does, something like "A SHACL validation engine takes two RDF graphs, a
shapes graph and a data graph, and validates the data graph against the
shapes graph as described herein.  SHACL tools may provide other interfaces,
e.g., one that takes an RDF graph in an RDF dataset, adds other triples to
that graph to produce the data graph, e.g., by using owl:imports triples,
and/or use triples in that graph to produce the shapes graph, e.g., using
sh:shapesGraph triples as described herein.  A SHACL validation engine MUST
implement all constructs in the core of SHACL (Sections 2, 3, 5).  A SHACL
engine need not implement the other parts of SHACL, but if it does implement a
portion of ... it MUST implement all of ...."
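
To make the split between the engine and the surrounding tooling concrete,
here is a minimal Python sketch.  It is not taken from the draft: graphs are
modeled as frozen sets of (subject, predicate, object) string triples, a
dataset as a dict from graph names to graphs, and build_data_graph and
validate are hypothetical names.  The import-following rule shown (treat the
object of every owl:imports triple as the name of another graph in the
dataset) is a simplifying assumption.

```python
from typing import Dict, FrozenSet, Tuple

Triple = Tuple[str, str, str]
Graph = FrozenSet[Triple]  # immutable, so a graph cannot change during validation

OWL_IMPORTS = "owl:imports"


def build_data_graph(name: str, dataset: Dict[str, Graph]) -> Graph:
    """Tool-level step: the named graph plus everything it imports, transitively."""
    seen, todo, triples = set(), [name], set()
    while todo:
        current = todo.pop()
        if current in seen or current not in dataset:
            continue  # already merged, or not available in this dataset
        seen.add(current)
        graph = dataset[current]
        triples |= graph
        # Assumption: every owl:imports object names another graph in the dataset.
        todo.extend(o for (_, p, o) in graph if p == OWL_IMPORTS)
    return frozenset(triples)


def validate(shapes_graph: Graph, data_graph: Graph) -> bool:
    """Engine proper: a function of exactly two graphs (body omitted in this sketch)."""
    raise NotImplementedError
```

The point of the split is that only validate needs to be specified
normatively; build_data_graph is one of possibly many front ends.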


Abstract

The abstract says that in SHACL "[a]dditional constraints can be associated
with shapes using SPARQL and similar extension languages."  This is not the
case as there is no specification for any execution language other than
SPARQL.

1. Introduction

The shapes graph is not a set of shapes.  Instead it contains a set of
shapes.

"data nodes" -> "nodes in the data graph"

The output of the validation process may be just a YES/NO.

The validation report may be truncated, and so may not include all validation results.

The scope of the shape ex:IssueShape in EXAMPLE 1 is not "all nodes that
have an rdf:type link to ex:Issue".  The scope of the shape ex:IssueShape in
EXAMPLE 1 does not state "that all instances of the class ex:Issue shall be
validated against the shape ex:IssueShape".  The actual scope is instead
somewhere between these two sets of nodes.
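
A small Python sketch of the three candidate sets may help.  It reflects my
reading, not text from the draft: graphs are plain sets of string triples,
the sample data and the function names are invented for illustration, and
the third function applies only the rdfs:domain rule as a stand-in for full
RDFS entailment.

```python
RDF_TYPE = "rdf:type"
SUBCLASS = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"


def superclasses(graph, cls):
    """cls plus every class reachable from it via rdfs:subClassOf triples in graph."""
    found, todo = set(), [cls]
    while todo:
        c = todo.pop()
        if c not in found:
            found.add(c)
            todo.extend(o for (s, p, o) in graph if s == c and p == SUBCLASS)
    return found


def direct_instances(graph, cls):
    """Nodes with an rdf:type link straight to cls."""
    return {s for (s, p, o) in graph if p == RDF_TYPE and o == cls}


def shacl_instances(graph, cls):
    """Nodes typed with cls or with a subclass of cls, using only triples in graph."""
    return {s for (s, p, o) in graph
            if p == RDF_TYPE and cls in superclasses(graph, o)}


def rdfs_domain_instances(graph, cls):
    """SHACL instances after also applying the rdfs:domain rule (a fragment of RDFS)."""
    inferred = set(graph)
    for (prop, p, dom) in graph:
        if p == DOMAIN:
            inferred |= {(s, RDF_TYPE, dom) for (s, q, _) in graph if q == prop}
    return shacl_instances(inferred, cls)


data = {
    ("ex:i1", RDF_TYPE, "ex:Issue"),
    ("ex:i2", RDF_TYPE, "ex:Bug"),
    ("ex:Bug", SUBCLASS, "ex:Issue"),
    ("ex:i3", "ex:reportedAgainst", "ex:p"),
    ("ex:reportedAgainst", DOMAIN, "ex:Issue"),
}
```

On this data the directly typed nodes are a proper subset of the SHACL
instances, which are in turn a proper subset of the instances obtained once
the rdfs:domain rule is applied.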

It is not sufficient to say in 1.1 that SHACL has unique versions of types
and instances.  These notions are in very widespread use.  Each time that
SHACL deviates from the common, accepted W3C practice it should be called
out, e.g., "SHACL type" or "SHACL instance".

The paragraph on special cases is hard to understand.  At a minimum, the
effects of all these special cases need to be called out whenever they
occur.  It would be much better to remove the special cases.

The document doesn't actually use SPARQL 1.1 as the normative definition of
SHACL constraints and scopes.   What it does is intersperse SPARQL with
hasShape.  A cleaner statement is needed here to clarify the role of SPARQL
1.1 in the semantics of the core of SHACL.

The wording about sh:hasShape seems to imply SPARQL engines that only cover
the core need not provide a mechanism that mirrors sh:hasShape.  This might
be true at the moment, as recursion is not supported, but may not be true in
the future.

Graphs do not have URIs in general.  Instead they may be available at a URL.
They may also be in a dataset, in which case they may have an IRI (or
whatever) in that dataset.

2. Shapes

2.1 Scopes

Class-based scopes use rdfs:subClassOf as well.

"that are supposed to be validated against" -> "that are in the scope of"

"The scope of a shape is a set of nodes.  It is not necessary that a node be
present in the data graph to belong to a scope."

"Triples with predicate sh:scopeNode in the shapes graph from a shape to a
node create a scope for the shape containing only the node.  SHACL
implementations MAY produce a warning if this node is not present in the
data graph."

Don't laud RDF Schema.  SHACL doesn't use RDF Schema.

"Triples with predicate sh:scopeClass in the shapes graph from a shape to a
node create a scope for the shape containing the nodes in the data graph
that are SHACL instances of the node in the data graph (linked to the node
via rdf:type or linked to the node via rdf:type followed by a chain of
rdfs:subClassOf links in the data graph)."
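
The suggested wording amounts to a purely set-based definition, which can be
sketched in a few lines of Python.  This is an illustration under my own
assumptions, not the draft's algorithm: graphs are sets of string triples and
the function names are hypothetical.

```python
def shacl_instances(data, cls):
    """Nodes in data typed with cls or with any rdfs:subClassOf descendant of cls."""
    classes, todo = set(), [cls]
    while todo:
        c = todo.pop()
        if c not in classes:
            classes.add(c)
            # Walk rdfs:subClassOf backwards, using triples in the data graph only.
            todo.extend(s for (s, p, o) in data
                        if p == "rdfs:subClassOf" and o == c)
    return {s for (s, p, o) in data if p == "rdf:type" and o in classes}


def scope(shapes, data, shape):
    """The scope of a shape: the union of its node scopes and class scopes."""
    nodes = set()
    for (s, p, o) in shapes:
        if s != shape:
            continue
        if p == "sh:scopeNode":
            nodes.add(o)  # in scope even if o does not occur in the data graph
        elif p == "sh:scopeClass":
            nodes |= shacl_instances(data, o)
    return nodes
```

Nothing here says when or in what order the sets are computed, which is the
level of description I am arguing for.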


At this point I stopped reading so carefully, but similar problems continue.


2.2 Filters

A SHACL processor might not begin by validating filters.  It might instead
look at all in-scope nodes and only later remove those that don't pass the
filters.  The document makes this illegal but it might be useful if the
filter shape is expensive to compute and few violations are expected.
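
The two evaluation orders can be sketched as follows in Python.  This assumes
that filters and constraints behave as side-effect-free predicates on a node;
passes_filter and violates are hypothetical callables, not names from the
draft.

```python
def validate_filter_first(nodes, passes_filter, violates):
    """Order the draft mandates: run the filter on every in-scope node first."""
    return {n for n in nodes if passes_filter(n) and violates(n)}


def validate_filter_last(nodes, passes_filter, violates):
    """Alternative: find violations first, then filter only the violating nodes."""
    candidates = {n for n in nodes if violates(n)}
    return {n for n in candidates if passes_filter(n)}
```

Both functions return the same set of violating nodes.  The second calls the
filter only on nodes that already violate the constraint, which is the
cheaper order when the filter is expensive and violations are rare.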

The UML-like diagram is misleading.  It uses rdfs:Resource in a way
different from SHACL.

3. Core Constraint Types

"points at" is used with two different meanings in the document.  Variables
don't point at anything.  They have a value.


At this point I started skipping large chunks.


4. Declaring the Shapes Graph

RDF graphs don't have to be part of a dataset.  What happens then?

This should be moved to a section that discusses how to get to the stage
where there is a shapes graph and a data graph that can be validated against
it.


12. Entailment

RDF does not define the notion of the IRI of a graph.  Therefore if SHACL is
going to use this notion it must define it on its own.

Received on Sunday, 6 March 2016 21:00:18 UTC