Quick Comments on https://www.w3.org/TR/2016/WD-shacl-20160814/ from Peter F. Patel-Schneider on 2016-08-15 (public-rdf-shapes@w3.org from August 2016)

From: Peter F. Patel-Schneider <peter.patel-schneider@nuance.com>
Date: Mon, 15 Aug 2016 15:03:31 -0700
To: <public-rdf-shapes@w3.org>
Message-ID: <7b8667af-1f3e-9cdb-58bc-e4a7fc367565@nuance.com>

Here are a few of the problems with the current public working draft I found
during a quick scan of it.

* pre-binding

SPARQL does not evaluate variables that occur in basic graph patterns.  This
means that the definition of pre-binding has unusual behaviour.  For
example, the normative SPARQL definition of sh:class will return validation
results for every pair of nodes in the graph such that there is an
rdf:type/rdfs:subClassOf* path from the first to the second.

This problem affects many parts of the definition of SHACL.  It means that
the normative definition of many SHACL constructs is counter to intuitions.
This problem is not ameliorated by the caution box in Appendix B.
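
The concern can be pictured with a toy triple-pattern matcher in Python.
This is only a sketch under stated assumptions: the graph, the helper names
(match, substitute) and the single-triple pattern are hypothetical, and the
matcher is not a SPARQL engine.  It shows that a pattern whose variables are
left in place matches every pair, while substituting the values first
restricts the match to the intended node.

```python
# Toy illustration only: hypothetical data and helpers, not a SPARQL engine.
GRAPH = {
    ("ex:a", "rdf:type", "ex:C"),
    ("ex:b", "rdf:type", "ex:D"),
}

def is_var(term):
    """Treat terms starting with ? or $ as variables."""
    return term[0] in "?$"

def match(pattern, graph):
    """Return one solution (a dict of bindings) per matching triple."""
    solutions = []
    for triple in sorted(graph):
        solution = {}
        for p_term, g_term in zip(pattern, triple):
            if is_var(p_term):
                # A repeated variable must match the same term each time.
                if solution.setdefault(p_term[1:], g_term) != g_term:
                    break
            elif p_term != g_term:
                break
        else:
            solutions.append(solution)
    return solutions

def substitute(pattern, bindings):
    """Replace variables by their given values before matching."""
    return tuple(bindings.get(t[1:], t) if is_var(t) else t for t in pattern)

PATTERN = ("$value", "rdf:type", "$class")

# Variables left in the pattern: every (value, class) pair in the graph matches.
unrestricted = match(PATTERN, GRAPH)

# Values substituted first: only the intended triple is tested.
restricted = match(substitute(PATTERN, {"value": "ex:a", "class": "ex:C"}), GRAPH)
```

Here unrestricted holds one solution per rdf:type triple, whereas restricted
holds a single empty solution for the one triple that was asked about.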

* syntax of SPARQL variables

SPARQL treats $ and ? as equivalent so $PATH and ?PATH both refer to the
PATH variable.  SHACL uses $ as a special marker and includes $ and ? as
part of the variable.

Would ?PATH be substituted as $PATH is?  If a SPARQL query for a SHACL
constraint only used ?this would the variable this be pre-bound?
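
As a small illustration of the SPARQL side of this, the following Python
sketch collects variable names from a query string while ignoring which
marker was used.  The regular expression is a simplification (ASCII names
only, no handling of ? or $ inside strings or IRIs), and variable_names is a
hypothetical helper, not part of any SPARQL library.

```python
import re

# Simplified variable syntax: a ? or $ marker followed by an ASCII name.
# The marker is not part of the captured variable name.
VAR = re.compile(r"[?$]([A-Za-z0-9_]+)")

def variable_names(query):
    """Return the set of variable names in query, without their markers."""
    return set(VAR.findall(query))

query = "SELECT $this WHERE { ?this $PATH ?value . FILTER(?PATH != $value) }"
names = variable_names(query)
```

Under this reading the query above has three variables (this, PATH, value),
not six, which is why the questions about ?PATH and ?this arise.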

* pre-binding optional?

"SPARQL variables using the $ marker represent external values that must be
pre-bound or substituted in the SPARQL query before execution."
"When SPARQL constraints are executed, the validation engine should pre-bind
values for these variables."
Are some $-marked variables not necessarily pre-bound, counter to the
earlier requirement?

* $PATH vs other $-prefixed variables

The variable PATH is treated specially in SHACL.  However, the general
description of $ does not specially call out PATH:
"SPARQL variables using the $ marker represent external values that must be
pre-bound or substituted in the SPARQL query before execution."

* $value

$value is used in many ASK queries.  However the definition of ASK
validators does not appear to pre-bind value.

* aggregation

The prohibition "Furthermore, any query that uses the variable $this in an
aggregation is invalid." is vague.  It appears to disallow the use of $this
in any part of the SPARQL 1.1 aggregation machinery, as the pointer in the
sentence is to Section 11 of the SPARQL specification.  This would rule out
all of the examples of aggregation in the SHACL document.

* ASK validators syntax

The syntax for ASK queries in SPARQL 1.1 is
  "ASK" DatasetClause* WhereClause SolutionModifier
The syntax for WhereClause is
  'WHERE'? GroupGraphPattern
The syntax for EXISTS constructs in SPARQL 1.1 is
  'EXISTS' GroupGraphPattern
Stripping the ASK from the beginning of an ASK query does not generally end
up with a GroupGraphPattern that can be used as the argument for EXISTS.

It appears that the values of sh:ask are never used as ASK queries by SHACL
processors.  Why then are these of the form of ASK queries?
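
A rough Python sketch of the mismatch follows.  The check is deliberately
crude (it only looks at the first and last character), the helper names are
hypothetical, and the sample queries are plain strings whose ex: prefix is
not declared.  It is meant only to show which shapes of ASK query survive
having the ASK keyword stripped.

```python
import re

def strip_ask(query):
    """Drop a leading ASK keyword (case-insensitive) and outer whitespace."""
    return re.sub(r"^\s*ASK\b", "", query, flags=re.IGNORECASE).strip()

def looks_like_group_graph_pattern(text):
    """Crude test: a GroupGraphPattern starts with '{' and ends with '}'."""
    return text.startswith("{") and text.endswith("}")

# Four ASK queries that the grammar quoted above permits.
QUERIES = [
    "ASK { $value a ex:C }",                              # bare pattern
    "ASK WHERE { $value a ex:C }",                        # optional WHERE
    "ASK FROM <http://example.org/g> { $value a ex:C }",  # dataset clause
    "ASK { $value a ex:C } LIMIT 1",                      # solution modifier
]

usable = [looks_like_group_graph_pattern(strip_ask(q)) for q in QUERIES]
```

Only the first query leaves something that could follow EXISTS directly; the
other three leave a WHERE keyword, a dataset clause, or a trailing solution
modifier.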

* different levels of SHACL implementation

There are several different kinds of SHACL implementations that are hinted
at in the document.

"SHACL implementations may, but are not required to, support entailment
regimes."
"Access to the shapes graph is not a requirement for supporting the SHACL
Core language."
"This sections [sic] defines the built-in SHACL constraint components that
MUST be supported by all SHACL validation engines."
"Not all SHACL validation engines need to support this variable."
"The same support policies as for $shapesGraph apply for this variable."
"SPARQL engines with full SHACL support can install a new SPARQL function
based on the SPARQL 1.1 Extensible Value Testing mechanism."
"SHACL validation engines are not required to support any entailment regimes."
"SHACL implementations with full support of the SHACL SPARQL extension
mechanism must implement a function sh:hasShape, ...."
"A SHACL validation engine MUST implement all constructs in the Core of SHACL
(Sections 2, 3, 4). A SHACL engine MAY not implement the other parts of
SHACL."
"Implementations that cover only the the SHACL Core features are not
required to implement these mechanisms or the sh:hasShape function."
"SHACL validation engines MAY pre-bind the variable $shapesGraph to provide
access to the shapes graph."
"A SHACL validation engine MAY use such suggestions to determine which shapes
graph to use for validating a data graph."
"A SHACL validation engine MAY take this information into account to
determine which shapes graph to use for validating a data graph that uses
that ontology or vocabulary."
 
There needs to be a section that explicitly defines the different levels of
implementation.

* order of processing for filters

The discussion of how filters are processed appears to be contradictory.
First there is:
"SHACL validation engines MAY alter the order of the depicted steps as long
as the returned validation results are correct."
Later there is:
"Filter shapes MUST be evaluated before validating the associated shapes or
constraints."

* $shapesGraph

The status of $shapesGraph is unclear:
"SPARQL variables using the $ marker represent external values that must be
pre-bound or substituted in the SPARQL query before execution."
"SHACL validation engines MAY pre-bind the variable $shapesGraph to provide
access to the shapes graph."

* circular filters

What happens if a shape is one of its own filters?

* EXISTS and blank nodes

The definition of ASK binds the value variable and then uses it inside an
EXISTS.  The definition of SPARQL provides a counter-intuitive result if
this variable is bound to a blank node, resulting in, for example, a
sh:class constraint with class ex:C returning no violation for _:d in any
data graph containing the triple
  ex:c rdf:type ex:C .
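
The effect can be imitated with a toy matcher in Python.  This is a sketch
under stated assumptions, not SPARQL evaluation: the helpers exists and
substitute are hypothetical, and the only SPARQL behaviour modelled is that
a blank node appearing in a graph pattern acts like a variable.

```python
# Toy illustration only: hypothetical helpers, not a SPARQL engine.
GRAPH = {("ex:c", "rdf:type", "ex:C")}

def acts_as_variable(term):
    """In a graph pattern, blank nodes behave like variables."""
    return term.startswith(("?", "$", "_:"))

def substitute(pattern, bindings):
    """Replace $ or ? variables in the pattern by their bound values."""
    return tuple(
        bindings.get(t[1:], t) if t[0] in "?$" else t for t in pattern
    )

def exists(pattern, graph):
    """True if at least one triple in graph matches pattern."""
    for triple in graph:
        bound = {}
        for p_term, g_term in zip(pattern, triple):
            if acts_as_variable(p_term):
                if bound.setdefault(p_term, g_term) != g_term:
                    break
            elif p_term != g_term:
                break
        else:
            return True
    return False

# $value is bound to the blank node _:d, then placed inside the pattern.
after = substitute(("$value", "rdf:type", "ex:C"), {"value": "_:d"})

# _:d now matches ex:c, so the check succeeds although _:d has no type.
conforms = exists(after, GRAPH)
```

In this sketch conforms is true, so no violation would be reported for _:d
even though the graph says nothing about it.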

* union operations on data graphs and shapes graphs

It is unclear just what the data graph and the shapes graph are.  There is
wording that both of these cannot be changed.  However, there is also
wording that various kinds of union operations are to be performed on shapes
and data graphs.


* It is unclear what is meant by:  "The variable $targetNode is assumed to
  be pre-bound to the given value of sh:targetNode."  Is this something that
  SHACL implementations have to do?  There are several occurrences of this
  kind of wording.

* MAY is used in Section 1.5 but not defined until Section 1.6

* "A SHACL engine MAY not implement the other parts of SHACL." reads as if
  no SHACL engine is allowed to implement any non-core part of SHACL.

* "The data graph SHOULD include all the ontology axioms related to the data
  and especially all the rdfs:subClassOf triples in order for SHACL to
  correctly identify class targets and validate Core SHACL constraints."
  Data graphs are just graphs.  How, then, can SHOULD be applied to them?

* "A SHACL validation engine MAY use such suggestions to determine which
  shapes graph to use for validating a data graph."  Can this be done even
  when an explicit shapes graph is provided to the engine?

* "The same mechanism applies for ontologies or vocabularies included in the
  shapes graph. The ontology or the vocabulary IRI can point to one or more
  shapes graphs with the predicate sh:shapesGraph. A SHACL validation engine
  MAY take this information into account to determine which shapes graph to
  use for validating a data graph that uses that ontology or vocabulary."
  If there already is a shapes graph in play, why is there any need for a
  different shapes graph to be used?

* "a deep copy of sh:path as its sh:path"  What is "deep copy" in this
  context?

* "A filter is a shape in a shapes graph that can be used to limit the nodes
  that are validated against a given constraint or shape."   Are there some
  filters that cannot be used in this way?  Which ones?

* "The following table enumerates variables that have special meaning in
  SPARQL constraints. When SPARQL constraints are executed, the validation
  engine should pre-bind values for these variables."  However, many other
  variables also need to be pre-bound, such as the variables corresponding
  to parameters.
Received on Monday, 15 August 2016 22:04:07 UTC