Re: read-through of Sections 5 and 7 (response to section 5) from Holger Knublauch on 2016-11-28 (public-rdf-shapes@w3.org from November 2016)

From: Holger Knublauch <holger@topquadrant.com>
Date: Mon, 28 Nov 2016 10:22:10 +1000
To: public-rdf-shapes@w3.org
Message-ID: <eabc8eef-3831-bbd0-8f6d-15bb273cf019@topquadrant.com>
On 27/11/2016 0:56, Peter F. Patel-Schneider wrote:
> I did a read-through of Section 5 and Section 7.  I found many problems.
> Some of these are likely simple problems of wording.  Many others are
> problems with undefined or poorly defined notions.
>
> There isn't even any definition or even description of how SPARQL-based
> constraints actually work within the larger context of a SHACL Full system.
> To see that, note that SPARQL-based constraints are to be "executed", but
> there is no discussion of when that happens.
>
>
> Section 5 SPARQL-based Constraints
>
> "As elaborated in the section on prefix handling rules, the value of
> sh:select must be transformable into a SPARQL 1.1 SELECT query. The query
> must project the result variable this in its SELECT clause."    There is no
> definition for the notion of projection here.

I have replaced "project" with "return" including a hyperlink to section 
16.1 of SPARQL 1.1 query.

>
> "The property sh:declare is used to make individual prefix declarations. The
> SHACL vocabulary defines the class sh:PrefixDeclaration for the values of
> sh:declare although no rdf:type triple is required for them."  Remove
> 'individual'.

Done.

>    What is the SHACL vocabulary?

Section 1 includes several mentions of this vocabulary, including a 
hyperlink to the Turtle file.

>    Is the expected type sh:PrefixDeclaration?

Not necessarily. What impact would it have? All we need to state here is 
that the rdf:type triple is optional.

>
> "The recommended subject for values of sh:declare is the IRI of the graph
> containing the shapes that use the prefixes. These IRIs are often declared
> as an instance of owl:Ontology, but this is not required."  Not all shapes
> graphs will have IRIs.

Yes, that's why I said "recommended" and "often".

>    How is an IRI declared as an instance?

Following the usual RDF techniques. See also the definition of the term 
SHACL Class Instance.

>
> "These nodes can use the property sh:prefixes to specify a set of prefix
> mappings."  "The values of sh:prefixes must be IRIs or blank nodes. A SHACL
> processor collects a set of prefix mappings as the union of all single
> prefix mappings that can be reached by the property path
> sh:prefixes/owl:imports*/sh:declare starting at the SPARQL-based
> constraint. If such a collection of prefix declarations contains multiple
> namespaces for the same sh:prefix, then the shapes graph is invalid. A SHACL
> processor transforms the values of sh:select (and similar properties such as
> sh:ask) into SPARQL by prepending PREFIX declarations for all namespace
> prefix mappings. Each value of sh:prefix is turned into the PNAME_NS, while
> each value of sh:namespace is turned into the IRIREF in the PREFIX
> declaration."  What is a 'single prefix mapping'?

I have replaced "single" with "individual". What remains unclear?

>    Why is this couched as
> the actions of a SHACL processor instead of just being a relationship
> between a node and a set of prefix mappings?

Why not? There are different ways of explaining the same thing. We have 
referred to SHACL processors in many other places.

>    What happens if an invalid set
> of prefix mappings exists in a shapes graph but is not used by any shape in
> the graph?

I believe this was already covered, but I have added a sentence:

(Note that SHACL processors MAY ignore prefix declarations that are 
never reached).

Which essentially means that tools may produce a warning but the shapes 
graph is not invalid. I have no strong opinion on this - if someone does 
feel strongly about this policy please let me know.

>    What does 'the same sh:prefix' mean?

Changed to "the same <a>value</a> of <code>sh:prefix</code>".

>    The values of sh:select
> and similar properties are generally already SPARQL so why do they need to
> be transformed *into* SPARQL?

The values of sh:select are not already SPARQL because they may lack 
prefix declarations.
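
To illustrate the transformation, here is a minimal Python sketch. It assumes the (prefix, namespace) pairs have already been collected from the shapes graph; walking the sh:prefixes/owl:imports*/sh:declare path is not shown. The function name prepend_prefixes and the example namespace are hypothetical and not taken from the spec.

```python
from typing import Dict, Iterable, Tuple


def prepend_prefixes(query_body: str,
                     declarations: Iterable[Tuple[str, str]]) -> str:
    """Return query_body with one PREFIX line per (prefix, namespace) pair.

    Raises ValueError if the same prefix is mapped to two different
    namespaces, which corresponds to the invalid shapes graph case.
    """
    mappings: Dict[str, str] = {}
    for prefix, namespace in declarations:
        # setdefault keeps the first namespace seen for this prefix
        if mappings.setdefault(prefix, namespace) != namespace:
            raise ValueError(f"conflicting namespaces for prefix {prefix!r}")
    # sh:prefix becomes the PNAME_NS, sh:namespace becomes the IRIREF
    header = "".join(f"PREFIX {p}: <{ns}>\n"
                     for p, ns in sorted(mappings.items()))
    return header + query_body


query = prepend_prefixes(
    "SELECT $this WHERE { $this ex:age ?age . FILTER (?age < 0) }",
    [("ex", "http://example.org/ns#")],
)
```

The value of sh:select on its own would not parse because ex: is undeclared; the returned string is what would actually be handed to a SPARQL engine.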

>    What is a 'namespace prefix mapping'?

I have dropped the 'namespace'. The term 'prefix mapping' is used in the 
SPARQL spec and should be well-known. Would you prefer us to switch to 
'binding' instead of 'mapping', or what is the issue here?

>
> "The following table enumerates variables that have special meaning in
> SPARQL constraints."  It is not clear whether this is a complete enumeration
> (as suggested by the use of the word enumerate) or just some examples of
> variables that have special meaning.

Changed to "enumerates *the* variables", which hopefully clarifies the 
intent.

>
> "When SPARQL constraints are executed, the SHACL Full processor pre-binds
> values for these variables."  There is no notion of when SPARQL constraints
> are to be executed.

Switched to "When SPARQL constraints are processed, ..." with a link to 
the definition of validation in section 3. That section defines 
"processing". (I could use "validated" but then some people may complain 
that constraints are not validated but only the focus nodes are.)

>
> "If one of the solutions of the result set produced by a SELECT query
> contains the binding true for the variable failure, then the SHACL Full
> processor MUST signal a failure."  Produced under what circumstances?

Clarified as "produced by a SELECT query during a validation process ..."

>
> "Otherwise, each row of the result set produced by a SELECT query MUST be
> converted into one validation result node."  This states that this MUST be
> done for all queries, even those queries that produce result sets due to
> being a value for sh:shape or similar parameters.

I noticed that when we deleted sh:hasShape, we lost an important 
paragraph about this very issue. I have tried to restore its meaning by 
adding the following to the Validation Report section:

Any validation results produced by the processing of shapes as values 
of shape-expecting constraint parameters (such as sh:shape) are 
temporary, i.e. they are not added to the results graph of the 
surrounding validation process. However, some implementations may add 
those nested validation results as annotations to the surrounding 
validation results, via sh:detail.

With this back in place, I believe the case above is covered as 
originally intended again.
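
As a rough sketch of how the failure rule and the row conversion rule quoted above fit together, assuming each solution is a plain dict from variable name to value. The ValidationFailure class and the shape of the result dicts are hypothetical; real result nodes would be RDF nodes with the derived property values.

```python
from typing import Dict, List


class ValidationFailure(Exception):
    """Signals a failure, as opposed to an ordinary validation result."""


def rows_to_results(solutions: List[Dict[str, object]]) -> List[Dict[str, object]]:
    """Convert SELECT solutions into one result per row.

    If any solution binds the variable 'failure' to true, no results are
    produced and a failure is signalled instead.
    """
    if any(row.get("failure") is True for row in solutions):
        raise ValidationFailure("a solution bound ?failure to true")
    # One (hypothetical) result record per row of the result set
    return [{"focusNode": row.get("this"), "bindings": dict(row)}
            for row in solutions]


results = rows_to_results([
    {"this": "ex:Alice", "value": -1},
    {"this": "ex:Bob", "value": -7},
])
```

Whether such results end up in the results graph or are only temporary (the nested case discussed above) would be decided by the caller, not by this function.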

>
> "The properties of those nodes are derived by the following rules, through a
> combination of result variables and the properties linked to the constraint
> itself. The production rules are meant to be executed from top to bottom, so
> that the first bound value will be used."  What is a property of a node?

Switched to "the property values of those nodes".

> What is a production rule?

Dropped "production".

>
> "The value of the variable path (only supports property IRIs, no complex
> paths)"  What does the parenthetical remark mean?

Changed to "The value of the variable <code>path</code>, if this value 
is a <a>IRI</a>". The intent was to make clear that property paths 
cannot be produced this way.

>
> "The values of sh:message of the subject of the sh:select or sh:ask
> triple. These string literals may reference any binding of the SELECT result
> variables via {?varName} or {$varName}. If the constraint is based on a
> constraint component, then the component's parameter names can also be
> used. The {?varName} blocks SHOULD be substituted with suitable string
> representations of the values of said variables."  Which sh:select or sh:ask
> triple?

This really should be clear to the reader, and elaborating on it would 
only make the document less readable. And since this 
section is referenced from both sh:sparql constraints and constraint 
components, I would need to add complicating prose just to clarify which 
case applies in which situation.

>    How does a string literal reference anything?  What is a suitable
> string representation of a variable value?

I have adjusted the wording, see the diff link below.

> Why is this only a SHOULD?

Because this is for the human-readable message only, and there are 
different policies that different tools will want to follow. For example 
in our tool aimed at expert users (TopBraid Composer) we insert qnames, 
while our end-user facing tools insert rdfs:labels. If we make any of 
the string substitution stuff mandatory then we'd need to also specify 
the details about what needs to happen.
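
A minimal sketch of such a substitution with a pluggable policy, to show why the details are tool-specific. The function render_message, its to_text parameter, the simplified variable-name pattern, and the choice to leave unbound placeholders untouched are all assumptions for illustration and are not specified anywhere.

```python
import re
from typing import Callable, Mapping

# Matches {?varName} and {$varName}; the name pattern is a simplification
# of what SPARQL allows in variable names.
_PLACEHOLDER = re.compile(r"\{[?$]([A-Za-z_][A-Za-z0-9_]*)\}")


def render_message(template: str,
                   bindings: Mapping[str, object],
                   to_text: Callable[[object], str] = str) -> str:
    """Substitute placeholders using a tool-specific string policy."""
    def replace(match: "re.Match[str]") -> str:
        name = match.group(1)
        if name not in bindings:
            return match.group(0)  # assumption: keep unbound placeholders
        return to_text(bindings[name])

    return _PLACEHOLDER.sub(replace, template)


message = render_message(
    "Value {?value} is not allowed for {$this}",
    {"value": 42, "this": "ex:Bob"},
)
```

An expert-facing tool could pass a to_text function that produces qnames, while an end-user tool could pass one that looks up rdfs:label values; the template handling stays the same.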

>
> "Any such property needs to be declared via a value of sh:resultAnnotation
> at the subject holding the sh:select or sh:ask triple."  How does a subject
> hold a triple?

Switched to "at the subject of the sh:select..."

>
> "Property               Value type    Count          Description
>  sh:annotationProperty  rdf:Property  1 (mandatory)  The annotation property that shall be set
>  sh:annotationVarName   xsd:string    0..1           The name of the SPARQL variable to take the values from
>  sh:annotationValue                   0..unlimited   Constant nodes that shall be used as default values"
> What does the value type column mean here?  What does the count column mean here?

Clarified with

In this table, the Value type column states the required SHACL class or 
datatype of the property values, and the Count column indicates the 
minimum and maximum number of values that the properties may have. If 
these value types and counts are violated then the shapes graph is invalid.

>
> "For each solution of a SELECT result set, a SHACL Full processor MUST walk
> through the declared result annotations."  This should be specified as a
> relationship, not something that a SHACL processor does.

I disagree. Both forms are valid and have the same implications.

>
> "1. If a sh:resultAnnotation has a value for the property
> sh:annotationVarName then the SHACL Full processor MUST look for the
> variable named after the sh:annotationVarName 2. Otherwise, the SHACL Full
> processor MUST derive a variable name from the value of
> sh:annotationProperty using the same local name mechanism as described
> earlier " What does it mean for a variable to be named after something?

Switched to

"for the variable with the same name as the value of 
<code>sh:annotationVarName</code>"

>    There
> is no local name mechanism described earlier in the document.  In one case
> the processor looks for something, in the other case the processor derives a
> name.   These are different categories of action.

Dropped "described earlier in the document" - the hyperlink goes 
wherever this was moved to.

Prose changed to

Otherwise, the SHACL Full processor must use the local name 
of the value of sh:annotationProperty as the variable name
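
The two rules can be sketched as follows. This assumes that the local name of an IRI is the part after the last '#' or '/', which is a simplification; the linked definition in the spec is authoritative. The function names and the example IRIs are hypothetical.

```python
from typing import Optional


def local_name(iri: str) -> str:
    # Assumption: the local name is whatever follows the last '#' or '/'.
    cut = max(iri.rfind("#"), iri.rfind("/"))
    return iri[cut + 1:]


def annotation_variable(annotation_property: str,
                        annotation_var_name: Optional[str] = None) -> str:
    """Pick the SPARQL variable name for one result annotation."""
    # Rule 1: an explicit sh:annotationVarName wins.
    if annotation_var_name is not None:
        return annotation_var_name
    # Rule 2: otherwise fall back to the local name of the property.
    return local_name(annotation_property)


explicit = annotation_variable("http://example.org/ns#time", "t")
derived = annotation_variable("http://example.org/ns#time")
```

With these inputs the first call uses the declared name and the second derives the name from the property IRI, so both rules end in the same kind of outcome: a variable name to look up in the current solution.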

>
> "If a variable name could be determined, then the SHACL Full processor MUST
> copy the bindings for the given variable into the constructed validation
> results for the current solution."  This reads as if all the bindings are
> copied into each validation
> result, which doesn't appear to make sense.  How are the values copied into
> the result?

I have changed the prose, see the diff.

>
> "The values of sh:annotationProperty must not be from the SHACL namespace,
> to avoid clashes with variables that are already produced by other means."
> This does not prevent clashes with other variables.

I have removed that sub-sentence.


Diff: 
https://github.com/w3c/data-shapes/commit/71caba490f06b439c9cb8f85d615abdf07ca9871

I will respond to your comments on section 7 in a separate email.

Holger
Received on Monday, 28 November 2016 00:22:50 UTC