Re: nomenclature in current document - ISSUE-65 from Holger Knublauch on 2015-08-14 (public-data-shapes-wg@w3.org from August 2015)

From: Holger Knublauch <holger@topquadrant.com>
Date: Fri, 14 Aug 2015 16:26:00 +1000
To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <55CD89F8.7020502@topquadrant.com>

Thanks, these comments are at least actionable, even if it's just (in 
your own words) "lipstick on a pig". I was looking at your original 
message on this ISSUE and believe that most had already been addressed 
one way or another, so it's good to have an updated version of your 
complaints :)

Comments below.

On 8/14/2015 11:01, Peter F. Patel-Schneider wrote:
> This is my rational reconstruction of the current nomenclature and basic
> validation control process as described in the beginning of the current
> document.  This reconstruction makes significant changes from the document
> that I feel are needed to make any sense out of the description in the
> document.   This reconstruction doesn't even correspond to my preferred
> setup, which involves further changes away from the current document.
>
> I have included some quotes from the current document and commentary on
> them.  Most of the quotes are taken from Section 1 of the current document.
> I expect that there are many other places where questionable technical
> wording occurs in the document.

I have gone through the document with your comments in mind and tried to 
make the use of terminology more consistent.

>
> It took considerable effort to dig through even just the beginning of the
> document and come up with this reconstruction.  I think that this level of
> required effort means that readers outside of the working group are going to
> have a very hard time understanding what SHACL is supposed to be.
>
>
>
>
> Shape:
> - RDF node belonging to sh:Shape
> - group (zero or more) constraints
> - can (must?) have one (or more?) scopes

The answer is "can". It is perfectly plausible to define shapes that are 
not used in one graph, but may be used by other data graphs and reused 
by other shapes graphs, e.g. via sh:valueShape and AND/OR.

> - can have one (or more?) filters (which are shapes)

As above, the answer to this is in the details (e.g. the Turtle file). 
A shape can have any number of filters.

>
> Constraint:
> - RDF node belonging to sh:Constraint
> - constraints can have one (or more?) filters (which are shapes)
> - constraints are validated? ["evaluated" in document] against nodes in an
>    RDF graph (or dataset)

I have now switched to "validated" consistently.

> - validation of constraints checks for the presence or absence of certain
>    information in an RDF graph (in a dataset)
> - validation of constraints returns? ["produces" and "reports" in document]
>    constraint results, including informational results, warnings, and errors

I prefer "produce" over "returns" and have replaced "reports" with it.

>
> Shapes in Constraints:
> - constraint validation can recursively validate? ["match" in document]

I have replaced "match" with corresponding prose about validation 
results, or simply "have" in non-normative paragraphs.

>    shapes named in them against nodes in the graph
>
> Control
> - validate a constraint against a node in an RDF graph (in a dataset)
>    - if the node passes (all/any of?) the shape's filter(s)

Must be *all* filters.

>      then validate the rest of the constraint against the node (which depends
>        on what kind of a constraint it is)
>    - validation fails if an error result is returned and succeeds otherwise
> - validate a shape in a shape graph against a node in an RDF graph (in a
> dataset)
>    - if the node satisfies (all/any of?) the shape's scope(s) and
>         the node passes (all/any of?) the shape's filter(s)

*Any* of the scopes, i.e. they become a UNION.

>      then validate each of the shape's constraints against that node
>    - validation fails if any of the constraints that are validated return an
>      error result and succeeds otherwise
> - validate a shape in a shape graph against an RDF data graph (in a dataset)
>    - validate the shape against all nodes in the RDF data graph
> - validate a shape graph against an RDF data graph (in a dataset)
>    - validate each shape in the shape graph against the data graph
>
> Scopes:
> - individual scopes are satisfied precisely by a single node
> - class scopes are satisfied precisely by the instances ? of the class and its
>    subclasses ?

Yes, why is this unclear? Anything to improve?
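
To make the answers above concrete, here is a minimal Python sketch of the control flow: a node is selected if *any* scope selects it (class scopes include instances of subclasses), and constraints are only validated if the node passes *all* filters. The Graph and Shape classes and every function name are hypothetical and not taken from the draft; constraints are modelled as plain functions returning error messages, and recursion between shapes is deliberately not handled.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Set

Node = str  # hypothetical: nodes are just identifier strings here


@dataclass
class Graph:
    types: Dict[Node, Set[Node]] = field(default_factory=dict)        # node -> rdf:type values
    subclass_of: Dict[Node, Set[Node]] = field(default_factory=dict)  # class -> direct superclasses

    def superclasses(self, cls: Node) -> Set[Node]:
        """Return cls together with all of its transitive superclasses."""
        seen: Set[Node] = set()
        todo = [cls]
        while todo:
            c = todo.pop()
            if c not in seen:
                seen.add(c)
                todo.extend(self.subclass_of.get(c, ()))
        return seen

    def is_instance(self, node: Node, cls: Node) -> bool:
        """True if node has a type that is cls or one of its subclasses."""
        return any(cls in self.superclasses(t) for t in self.types.get(node, ()))


@dataclass
class Shape:
    scope_nodes: Set[Node] = field(default_factory=set)    # individual scopes
    scope_classes: Set[Node] = field(default_factory=set)  # class scopes
    filters: List["Shape"] = field(default_factory=list)   # filters are shapes
    # Each constraint returns a list of error messages for the given node.
    constraints: List[Callable[[Graph, Node], List[str]]] = field(default_factory=list)


def in_scope(shape: Shape, graph: Graph, node: Node) -> bool:
    """Scopes form a UNION: any one of them is enough."""
    return node in shape.scope_nodes or any(
        graph.is_instance(node, c) for c in shape.scope_classes
    )


def passes_filters(shape: Shape, graph: Graph, node: Node) -> bool:
    """A node must pass *all* filters; it passes one if validating it yields no errors."""
    return all(not validate_node(f, graph, node) for f in shape.filters)


def validate_node(shape: Shape, graph: Graph, node: Node) -> List[str]:
    """Validate the shape's constraints against one node and collect the errors."""
    if not passes_filters(shape, graph, node):
        return []  # filtered out: nothing is validated, so nothing can fail
    errors: List[str] = []
    for constraint in shape.constraints:
        errors.extend(constraint(graph, node))
    return errors


def validate_graph(shape: Shape, graph: Graph, nodes: List[Node]) -> Dict[Node, List[str]]:
    """Validate the shape against every node that its scopes select."""
    return {n: validate_node(shape, graph, n) for n in nodes if in_scope(shape, graph, n)}
```

In this sketch a shape without any scope simply selects no nodes in validate_graph, but it can still be passed to validate_node directly, which mirrors how it may be reused from other shapes.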

>
> Filters:
> - a node passes a filter shape precisely when validating the shape
>    against the node succeeds
>
>
> Possible Holes (aside from parts marked with ? above)
> - handling of recursive loops

This is a well-known hole represented by open ISSUEs and clearly marked 
as such in the document.

> - validating a shape with no scopes

If a Shape has no scope, and is not referenced via sh:nodeShape or 
rdf:type, then it will not be used for validation.
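
As a rough illustration of that rule, the following Python sketch collects the shapes that would be used for a given node. It assumes, purely for the example, that triples are plain tuples of prefixed-name strings and that the nodes selected by each shape's scopes have been computed beforehand; the function name and data layout are hypothetical.

```python
from typing import Dict, Iterable, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, predicate, object), hypothetical layout

# Properties through which a data node can point at a shape directly.
DIRECT_LINKS = ("sh:nodeShape", "rdf:type")


def shapes_used_for(
    node: str,
    triples: Iterable[Triple],
    scope_members: Dict[str, Set[str]],
) -> Set[str]:
    """Return the shapes that apply to node.

    scope_members maps every known shape to the set of nodes selected by its
    scopes; a shape without scopes maps to an empty set.
    """
    # Shapes whose scopes select the node.
    used = {shape for shape, members in scope_members.items() if node in members}
    # Shapes the node references explicitly via sh:nodeShape or rdf:type.
    used |= {
        o
        for s, p, o in triples
        if s == node and p in DIRECT_LINKS and o in scope_members
    }
    return used
```

A shape that has no scope and is never the object of such a triple ends up in no result set, so it is never used for validation.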

> - handling blank node constraints and shapes with no typing

This is explained in the document, e.g.

"When used as values of sh:property, property constraints represented as 
blank nodes do not require an rdf:type triple."

In the Turtle file these places are marked with sh:defaultValueType 
which you seem to object to in ISSUE-70. Depending on the resolution of 
ISSUE-70, I can clarify and formalize this further.

> - graph-level shapes - how is the graph resource recognized

There are no graph-level shapes in my draft. But users may attach shapes 
to the rdfs:Resource that has the same URI as the graph. No need to 
recognize those - they are just normal resources.

> - global constraints - violate much of the above nomenclature

There are no global constraints in my draft. Everything is scoped, and 
global property constraints can for example be expressed via 
sh:PropertyScope. I had added a corresponding example to the 
(informative) section 7.1. By dropping the concept of global 
constraints/shapes, the model and terminology become simpler.

>
>
>
> Quotes and comments:
>
>
> "Each constraint defines a condition that can be validated against a graph."
> - wrong - a constraint by itself cannot be validated against anything
>     - a constraint is not (generally) validated against a graph at all

Ok, changed to "a node in a graph"? Better suggestions?

"Each constraint defines a condition that can be validated against a 
node in a graph."

>
> "A shape describes a group of constraints with the same focus node."
> - wrong - constraints don't have focus nodes in this way

I am open to a better proposal, but I have for now replaced this with

"...constraints that should be validated against the same focus nodes"

>
> "allowing some shapes to further narrow down the constraints from other shapes"
> - unfounded - there is no notion of narrowing down constraints

Ok, I have clarified that this is about shape classes only:

"Such shape classes may be arranged in a specialization hierarchy, 
effectively allowing some shape classes to further narrow down the 
constraints from other shape classes."


>
> "validate all nodes in a given graph"
> - questionable - needs shapes as well as data

Ok, changed to

"Another supported operation is to validate a given graph using all 
applicable shapes."

>
> "output of constraint validation is a set of constraint violations"
> - wrong - can also result in warnings and informational results

Ok, changed to

"is a set of validation results"

(I have also switched from "Violations" to "Validation Results" 
everywhere else unless it was really about violations (Warnings and Errors))

>
> "a given RDF node matches a given shape"
> - unfounded - match is never defined

Ok, changed to

"One of the operations that SHACL engines should support validates a 
given RDF node against a given shape, producing validation results, 
including informational messages, warnings and errors."
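
The distinction between the kinds of results could be sketched in Python as follows. This is only an illustration under my own assumptions: the class and function names are hypothetical, and the rule that validation fails only on error-level results is taken from the reconstruction quoted earlier in this message.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List


class Severity(Enum):
    INFO = 1
    WARNING = 2
    ERROR = 3


@dataclass(frozen=True)
class ValidationResult:
    severity: Severity
    focus_node: str
    message: str


def violations(results: Iterable[ValidationResult]) -> List[ValidationResult]:
    """Violations are the warnings and errors; informational results are not."""
    return [r for r in results if r.severity in (Severity.WARNING, Severity.ERROR)]


def succeeds(results: Iterable[ValidationResult]) -> bool:
    """Validation fails if any error-level result was produced, and succeeds otherwise."""
    return not any(r.severity is Severity.ERROR for r in results)
```

With this split, a run that only produces informational results or warnings still counts as successful, while its warnings remain available as violations.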

>
> "that all focus nodes need to fulfill before they are evaluated"
> - unfounded - no notion of evaluation for nodes

Ok, replaced "evaluated" with "validated".

>
> "where the object does not match the shape specified by sh:valueShape"
> - unfounded - no notion of matching for shapes

Ok, changed to

"where validating the object against the shape specified by 
sh:valueShape produces any error-level constraint violations"

>
> "A violation must be reported"
> - unused - nothing depends on reported violations
>
There are many places where this phrase is used, but where it occurs I 
believe the use is perfectly adequate: It is used in the textual 
definitions of the various core language elements, and these definitions 
should be sufficiently detailed to guide users and developers. An engine 
that conforms to SHACL *MUST* report these results.


All my changes are now online - you can see a detailed diff of changes here:

https://github.com/w3c/data-shapes/commit/3cbf292ec25e3f85717d5b8058410ce6f7a7b689

In an attempt to represent your bullet-style compact summary of SHACL, I 
have added an experimental Glossary into the Appendix:

http://w3c.github.io/data-shapes/shacl/#terms

Let me know if this is helpful or not. It may double as a summary of 
content that is otherwise spread across the rest of the document.

I have not yet gone through the shacl-ref document to check for 
terminological inconsistencies. I cannot judge how much further 
"nitpicking" will be required for the FPWD. I hope we can proceed now.

Holger
Received on Friday, 14 August 2015 06:28:14 UTC