read-through of Sections 5 and 7 from Peter F. Patel-Schneider on 2016-11-26 (public-rdf-shapes@w3.org from November 2016)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Sat, 26 Nov 2016 06:56:04 -0800
To: "public-rdf-shapes@w3.org" <public-rdf-shapes@w3.org>
Message-ID: <55e3673a-3350-c047-9f05-2484e152cd0a@gmail.com>

I did a read-through of Section 5 and Section 7.  I found many problems.
Some of these are likely simple problems of wording.  Many others are
problems with undefined or poorly defined notions.

There is no definition, or even a description, of how SPARQL-based
constraints actually work within the larger context of a SHACL Full system.
To see that, note that SPARQL-based constraints are to be "executed", but
there is no discussion of when that happens.


Section 5 SPARQL-based Constraints

"As elaborated in the section on prefix handling rules, the value of
sh:select must be transformable into a SPARQL 1.1 SELECT query. The query
must project the result variable this in its SELECT clause."    There is no
definition for the notion of projection here.

"The property sh:declare is used to make individual prefix declarations. The
SHACL vocabulary defines the class sh:PrefixDeclaration for the values of
sh:declare although no rdf:type triple is required for them."  Remove
'individual'.  What is the SHACL vocabulary?  Is the expected type
sh:PrefixDeclaration?

"The recommended subject for values of sh:declare is the IRI of the graph
containing the shapes that use the prefixes. These IRIs are often declared
as an instance of owl:Ontology, but this is not required."  Not all shapes
graphs will have IRIs.  How is an IRI declared as an instance?

"These nodes can use the property sh:prefixes to specify a set of prefix
mappings."  "The values of sh:prefixes must be IRIs or blank nodes. A SHACL
processor collects a set of prefix mappings as the union of all single
prefix mappings that can be reached by the property path
sh:prefixes/owl:imports*/sh:declare starting at the SPARQL-based
constraint. If such a collection of prefix declarations contains multiple
namespaces for the same sh:prefix, then the shapes graph is invalid. A SHACL
processor transforms the values of sh:select (and similar properties such as
sh:ask) into SPARQL by prepending PREFIX declarations for all namespace
prefix mappings. Each value of sh:prefix is turned into the PNAME_NS, while
each value of sh:namespace is turned into the IRIREF in the PREFIX
declaration."  What is a 'single prefix mapping'?  Why is this couched as
the actions of a SHACL processor instead of just being a relationship
between a node and a set of prefix mappings?  What happens if an invalid set
of prefix mappings exists in a shapes graph but is not used by any shape in
the graph?  What does 'the same sh:prefix' mean?  The values of sh:select
and similar properties are generally already SPARQL so why do they need to
be transformed *into* SPARQL?  What is a 'namespace prefix mapping'?
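
One way to read the quoted prefix rules as a relationship between a node and
a set of prefix mappings, rather than as processor actions, is sketched below
in Python. This is only an illustration under stated assumptions: the
dict-based graph model and the names objects, collect_prefixes and
prepend_prefixes are hypothetical and do not come from the SHACL draft.

```python
# Hypothetical graph model: (subject, predicate) -> list of objects.
def objects(graph, subject, predicate):
    return graph.get((subject, predicate), [])


def collect_prefixes(graph, constraint):
    """Follow sh:prefixes/owl:imports*/sh:declare starting at the constraint."""
    # sh:prefixes, then zero or more owl:imports hops.
    frontier = list(objects(graph, constraint, "sh:prefixes"))
    reached = set()
    while frontier:
        node = frontier.pop()
        if node not in reached:
            reached.add(node)
            frontier.extend(objects(graph, node, "owl:imports"))
    # sh:declare from every reached node, rejecting conflicting namespaces.
    mappings = {}
    for node in reached:
        for decl in objects(graph, node, "sh:declare"):
            prefix = objects(graph, decl, "sh:prefix")[0]
            namespace = objects(graph, decl, "sh:namespace")[0]
            if mappings.setdefault(prefix, namespace) != namespace:
                raise ValueError(f"conflicting namespaces for prefix {prefix!r}")
    return mappings


def prepend_prefixes(mappings, query):
    """Put one PREFIX line per mapping in front of the query string."""
    lines = [f"PREFIX {p}: <{ns}>" for p, ns in sorted(mappings.items())]
    return "\n".join(lines + [query])
```

In this reading a conflict is only detected for declarations that are
reachable from some constraint, which is exactly the case the quoted text
leaves open for unused prefix declarations.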

"The following table enumerates variables that have special meaning in
SPARQL constraints."  It is not clear whether this is a complete enumeration
(as suggested by the use of the word enumerate) or just some examples of
variables that have special meaning.

"When SPARQL constraints are executed, the SHACL Full processor pre-binds
values for these variables."  There is no notion of when SPARQL constraints
are to be executed.

"If one of the solutions of the result set produced by a SELECT query
contains the binding true for the variable failure, then the SHACL Full
processor MUST signal a failure."  Produced under what circumstances?

"Otherwise, each row of the result set produced by a SELECT query MUST be
converted into one validation result node."  This states that this MUST be
done for all queries, even those queries that produce result sets due to
being a value for sh:shape or similar parameters.
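
Read literally, the two quoted sentences describe the following mapping from
a SELECT result set to validation results. The Python sketch below is an
assumption-laden illustration: the row format (a dict from variable name to
value), the result keys focusNode and value, and the names ValidationFailure
and rows_to_results are all hypothetical.

```python
class ValidationFailure(Exception):
    """Signals a failure, as distinct from reporting a violation."""


def rows_to_results(rows):
    """Map a SELECT result set to validation results, read literally."""
    # Any solution binding ?failure to true aborts with a failure.
    if any(row.get("failure") is True for row in rows):
        raise ValidationFailure("a solution bound ?failure to true")
    # Otherwise every row becomes exactly one validation result.
    return [
        {"focusNode": row.get("this"), "value": row.get("value")}
        for row in rows
    ]
```

Nothing in this mapping depends on why the query was run, which is the
problem with applying it to every SELECT query.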

"The properties of those nodes are derived by the following rules, through a
combination of result variables and the properties linked to the constraint
itself. The production rules are meant to be executed from top to bottom, so
that the first bound value will be used."  What is a property of a node?
What is a production rule?

"The value of the variable path (only supports property IRIs, no complex
paths)"  What does the parenthetical remark mean?

"The values of sh:message of the subject of the sh:select or sh:ask
triple. These string literals may reference any binding of the SELECT result
variables via {?varName} or {$varName}. If the constraint is based on a
constraint component, then the component's parameter names can also be
used. The {?varName} blocks SHOULD be substituted with suitable string
representations of the values of said variables."  Which sh:select or sh:ask
triple?  How does a string literal reference anything?  What is a suitable
string representation of a variable value?  Why is this only a SHOULD?
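
To make the substitution question concrete, here is a minimal Python sketch
of one possible reading of the {?varName} and {$varName} blocks. It assumes
that str() gives the "suitable string representation" and that blocks naming
unbound variables are left untouched; neither choice is stated in the quoted
text, and the name fill_message is hypothetical.

```python
import re

# Matches {?name} or {$name}; the name syntax used here is an assumption.
_PLACEHOLDER = re.compile(r"\{[?$]([A-Za-z_][A-Za-z0-9_]*)\}")


def fill_message(template, bindings):
    """Substitute bound variables into a sh:message string."""
    def replace(match):
        name = match.group(1)
        # Unbound variables are left as written (an assumption).
        if name in bindings:
            return str(bindings[name])
        return match.group(0)
    return _PLACEHOLDER.sub(replace, template)
```

For example, fill_message("Value {?value} not allowed for {$this}",
{"value": 42, "this": "ex:Bob"}) yields "Value 42 not allowed for ex:Bob".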

"Any such property needs to be declared via a value of sh:resultAnnotation
at the subject holding the sh:select or sh:ask triple."  How does a subject
hold a triple?

"Property  Value type  Count  Description
sh:annotationProperty  rdf:Property  1 (mandatory)  The annotation property that shall be set
sh:annotationVarName  xsd:string  0..1  The name of the SPARQL variable to take the values from
sh:annotationValue    0..unlimited  Constant nodes that shall be used as default values"
What does the value type column mean here?  What does the count column mean here?

"For each solution of a SELECT result set, a SHACL Full processor MUST walk
through the declared result annotations."  This should be specified as a
relationship, not something that a SHACL processor does.

"1. If a sh:resultAnnotation has a value for the property
sh:annotationVarName then the SHACL Full processor MUST look for the
variable named after the sh:annotationVarName
2. Otherwise, the SHACL Full processor MUST derive a variable name from the
value of sh:annotationProperty using the same local name mechanism as
described earlier"  What does it mean for a variable to be named after
something?  There
is no local name mechanism described earlier in the document.  In one case
the processor looks for something, in the other case the processor derives a
name.   These are different categories of action.

"If a variable name could be determined, then the SHACL Full processor MUST
copy the bindings for the given variable into the constructed validation
results for the current solution."  This reads as if all the bindings are
copied into each validation result, which doesn't appear to make sense.  How
are the values copied into the result?
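
One relationship-style reading of the result annotation rules can be
sketched in Python as follows. The sketch assumes that "local name" means the
text after the last '#' or '/' of the IRI, that only the binding from the
current solution is used, and that the sh:annotationValue constants apply
when the variable is unbound. The dict keys property, varName and values and
both function names are hypothetical.

```python
def local_name(iri):
    """Assumed reading of 'local name': text after the last '#' or '/'."""
    cut = max(iri.rfind("#"), iri.rfind("/"))
    return iri[cut + 1:]


def annotation_values(annotation, solution):
    """Values of one result annotation for one solution (row)."""
    # Use the declared variable name, else derive one from the property IRI.
    var = annotation.get("varName") or local_name(annotation["property"])
    if var in solution:
        # Only the binding from the current solution is used.
        return [solution[var]]
    # Fall back to the constant default values, if any.
    return list(annotation.get("values", []))
```

Here both cases (declared name and derived name) produce a variable name in
the same way, and the lookup in the solution is a single separate step.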

"The values of sh:annotationProperty must not be from the SHACL namespace,
to avoid clashes with variables that are already produced by other means."
This does not prevent clashes with other variables.

Section 7 SPARQL-based Targets

"All subjects of sh:target triples must be IRIs."  This makes absolutely no
sense.  Why should the shapes that have SPARQL-based targets be IRIs?

"The SPARQL queries linked to a target via sh:select must be of the query
form SELECT."   This doesn't apply the prefix handling rules.

"The SELECT queries must project to the result variable this."  There is no
definition for "project to".

"The resulting target consists of all distinct bindings for the variable
this."  The target is the value of sh:target so how can it be the bindings?

"The SELECT queries must also be executable when converted to an ASK query
and with a pre-bound value for ?this."    There is no definition for
converting a SELECT query to an ASK query.  There is no notion of a query
being executable.

"The set of bindings for ?this that return true for such ASK queries must be
identical to the set produced by the SELECT query. This design makes sure
that SHACL Full processors can validate whether a given shape applies to a
given individual focus node."   A SHACL Full processor can always just run
the SELECT query and check whether the individual focus node is in the
result set.  So this condition, which is difficult and maybe even impossible
to check, is unnecessary.  The checking can even be done completely within
SPARQL by appending a values clause to the query, which can then be
optimized by the SPARQL processor, so there is not even any particular
reason to have this condition for efficiency purposes.
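
The values-clause approach can be sketched in Python as follows. This is a
minimal illustration, not a definitive implementation: it assumes the focus
node is an IRI, that the query projects the variable this, and that
run_select is some caller-supplied function that executes a SPARQL SELECT
string and returns rows as dicts. The names restrict_to_focus_node, is_target
and run_select are hypothetical.

```python
def restrict_to_focus_node(select_query, focus_iri):
    """Append a trailing VALUES clause that limits ?this to one IRI."""
    return f"{select_query.rstrip()}\nVALUES ?this {{ <{focus_iri}> }}"


def is_target(run_select, select_query, focus_iri):
    """True if the focus node is among the bindings for ?this."""
    rows = run_select(restrict_to_focus_node(select_query, focus_iri))
    return any(row.get("this") == focus_iri for row in rows)
```

SPARQL 1.1 allows a VALUES clause after the end of a SELECT query, so no
rewriting of the query body is needed for this check.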

"Similar to constraint components, such targets take parameters that are
interpreted when the target is evaluated."  There is no notion of evaluation
of targets.

"All parameters of target types are expected to have sh:maxCount 1."  What
does expectation mean in the context of parameters?

SPARQL-based target types appear to have many of the same characteristics as
sh:SPARQLTarget targets.  Why then do the underlying queries not have the same
restrictions?  Why are the queries not restricted to those whose top-level
SELECT includes (or has only) this in the variables of its select clause?
Why are queries not restricted to ones that can be converted to ASK queries
that have the same behaviour?



Peter F. Patel-Schneider
Nuance Communications
Received on Saturday, 26 November 2016 14:56:39 UTC