Re: Comments on FPWD from Holger Knublauch on 2015-09-03 (public-data-shapes-wg@w3.org from September 2015)

From: Holger Knublauch <holger@topquadrant.com>
Date: Thu, 3 Sep 2015 11:26:33 +1000
To: public-data-shapes-wg@w3.org
Message-ID: <55E7A1C9.3000105@topquadrant.com>

Hi Arthur,

many thanks for your feedback, especially during a vacation within 
retirement :)

On 9/2/2015 11:24, Arthur Ryman wrote:
> Major Issues

1. and 2. below are difficult for me to address. I need a decision from 
the WG on how to proceed. I believe your viewpoint reflects your 
experience with Resource Shapes, where no extension mechanism existed. 
OTOH, Peter's feedback rather reflects his SHACL-SPARQL design, and he 
seems to want to downplay the role of other execution languages besides 
SPARQL. Between these two viewpoints, there is the current technical 
design which relies on templates internally, even though this fact 
doesn't need to be known to users of the Core Language.

The previous rounds of this topic led to the division of the spec into 
Core and Advanced parts. A completely different approach would be to use 
the spec as a way to build the formal foundations of SHACL according to 
the technical layering, basically starting with Templates and then 
enumerating the built-in templates. But this would only be practical if 
we had a Primer that addresses the big picture because many people would 
run away screaming. All this is not a trivial change that I can make 
easily, and it would likely push us back by several months. Requires 
discussion.

>
> 1. The document conflates the two distinct uses we are making of
> SPARQL. First, we are using SPARQL to define the semantics of the
> built-in constraints. Second, we are supporting SPARQL as an
> executable extension language. The document must make these two uses
> clear.

See above. I believe these two aspects are just two sides of the same coin.

>   I suggest that the document replace the phrase "executable
> language" with "extension language".

I don't like the term extension language, because SPARQL is not just an 
extension but an integral part of the language's definition. But Peter 
had similar comments so this requires further thought.

>   Also, the document should explain
> that the semantics of some features of SHACL is given via SPARQL, but
> implementations are free to use any technology as long as they produce
> the same results.

I believe this is already stated in 1.3: "any alternative bodies need to 
follow the same semantics as the SPARQL queries." The whole division of 
the document into Core and Advanced parts was supposed to make clear 
that some engines may only support the Core and implement it in whatever 
way they want.

It would be best if you could send me a specific sentence that I can 
inject somewhere.

>
> 2. The document treats the built-in property constraints as templates.
> They are not conceptually templates - they are primitive, built-in
> capabilities. Templates are an advanced feature.

No, the built-in property constraints are not primitive, built-in 
capabilities. They are implemented using templates and work just like 
any other template defined by someone else. This design philosophy is a 
central strength of SHACL because it allows engines to work with a very 
simple and generic core implementation. But you are right that Templates 
are only introduced as advanced features and the fact that templates are 
used for the core doesn't need to be known to Core users. The first 
usage of the term "template" after the Overview is in section 6.4.

>   Furthermore, the
> syntax with which the built-in property constraints are invoked is
> different than the way template constraints are invoked. The built-ins
> are invoked via sh:property and sh:inverseProperty but templates are
> invoked via sh:constraint.

No, some built-ins are also invoked via sh:constraint (e.g. 
sh:ClosedShape, sh:AndConstraint) and people can subclass 
sh:PropertyConstraint and then use these subclasses via sh:property.

>   I understand Holger's statement that one
> can implement the built-ins via templates but that is just an
> implementation detail and should not bleed through into the spec.
> Users should not be forced to understand templates in order to use the
> built-ins.

Where do users need to understand templates to use the built-ins?

>
> 3. The document is inconsistent in its use of the term "constraint".
> Sometimes it is used in a general way, and sometimes in a very
> specific way referring to elements of the SHACL vocabulary. I suggest
> that whenever the document uses constraint in a specific way that it
> replace "constraint" with the corresponding RDF term from the SHACL
> vocabulary.

This is tricky because the corresponding RDF term - sh:Constraint - 
basically never shows up anywhere. People either use native constraints 
(with a SPARQL query) or instantiate a template constraint, which is a 
subclass of sh:Constraint. So I am not sure how to fix this. Again, 
specific suggestions would be ideal.

>
> 4. The Overview is missing a key concept, namely that of a "shape
> schema" which is a set of shapes that refer to each other, i.e. any
> reference to a shape, e.g. via sh:valueShape, must resolve to a shape
> definition within the shape schema.

Ok. I have added a sentence

"Shape and constraint definitions are represented in RDF graphs called 
shapes graphs."

to introduce the term that is also used elsewhere.

>
> 5. The design for ClosedShape is inconsistent with the rest of SHACL.
> The document treats ClosedShape as a constraint, but it is actually a
> characteristic of the shape. It should be promoted to be a direct
> property of sh:Shape and not wrapped in a sh:constraint.

This is more than an editorial change. If you want to make this specific 
suggestion, please raise an ISSUE.

>
> 6. Why are templates themselves classes? This seems like a minimally
> useful way to inherit arguments. A subtemplate must provide a new
> executable body that overrides the one defined in the supertemplate.
> So what is the actual benefit of inheritance other than for arguments?
> Seems like unnecessary complexity.

The main reason why templates are classes is that their template calls 
are *instances* with an rdf:type triple. Otherwise we would need to 
invent some other RDF syntax, whereas rdf:type was already used 
successfully in SPIN. Inheritance of arguments is one benefit. Another 
is the consistency that template definitions define their sh:arguments 
as constraints, making it possible to validate shapes graphs with a 
SHACL engine. Some templates even define SPARQL constraints to enforce 
additional syntactic constraints (e.g. cannot have both sh:datatype 
and sh:valueClass).
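
As a rough illustration of the two points above (template calls are 
instances typed by the template class, and arguments are inherited 
along the class hierarchy), here is a minimal sketch. It is not taken 
from the spec: the ex: names, the object layout and both helper 
functions are made up for this example, and plain JavaScript objects 
stand in for RDF triples.

```javascript
// Hypothetical template definitions, keyed by class IRI.
// All ex: names and the { superClass, args } layout are assumptions.
const templates = {
  "ex:BaseTemplate": { superClass: null, args: ["ex:predicate"] },
  "ex:MinLengthTemplate": { superClass: "ex:BaseTemplate", args: ["ex:minLength"] },
};

// Collect the arguments a template declares, including those inherited
// from its super-templates (inherited arguments come first).
function collectArguments(templateIri) {
  const result = [];
  for (let iri = templateIri; iri !== null; iri = templates[iri].superClass) {
    result.unshift(...templates[iri].args);
  }
  return result;
}

// A template call is an instance: its rdf:type names the template class.
const call = {
  "rdf:type": "ex:MinLengthTemplate",
  "ex:predicate": "ex:name",
  "ex:minLength": 3,
};

// Report the declared arguments that a call fails to supply.
function missingArguments(templateCall) {
  return collectArguments(templateCall["rdf:type"]).filter(
    (arg) => !(arg in templateCall)
  );
}

console.log(collectArguments("ex:MinLengthTemplate"));
console.log(missingArguments(call));
```

Because the call only carries an rdf:type, a generic engine can look up 
the template and check the call against the declared arguments without 
any template-specific code.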

Again, if you feel there is something wrong with this design, please 
file an ISSUE.

>
> 7. The description of the invocation API uses highly abbreviated
> pseudocode and has a lot of implicit context. I had to guess at what
> the pseudocode meant. The API needs to be made much more explicit.

I agree this is very compact and I welcome other specific proposals on 
how to write all this down. But if I want to be explicit, then I need 
the API for execution languages (your next point), because these depend 
on each other.

I have assigned myself a task to rewrite that section. It will need to 
become more precise, e.g. I am using triple patterns and SPARQL 
fragments in some places. How do other specs handle this? Maybe I should 
use real JavaScript or Java syntax, and define the methods that need to 
be implemented?

>
> 8. The API for execution languages was also fairly abbreviated. Why do
> we even need to define this? A useful API would require language
> bindings. I don't think we need to go there at this point. I suggest
> we drop the extension language API from the spec.

See above.

>
> 9. Why are we defining sh:Function? These are only invocable from an
> extension language, but extension languages have their own mechanisms
> for defining functions. I suggest we drop this from the spec.

In a new version of the spec, validation functions are used to define 
many property constraints, so they are now an essential part of the spec. 
The SPARQL bodies of several built-ins also require functions (e.g. 
sh:valueCount). We also had several resolutions to support functions. If 
you think there is a problem, please raise an ISSUE. Of course they may 
appear irrelevant to someone focusing on the core language, but they are 
crucial for most SPARQL-based scenarios that we encounter in practice. 
Also note that even from JavaScript there could be a mechanism to invoke 
SHACL functions, e.g.

     var count = shacl.invoke(SH.valueCount, subject, predicate);

If the JavaScript implementation wraps a SPARQL engine that talks to the 
data stored in JSON-LD, this will work nicely. Likewise, we have 
functions that use JavaScript, called from SPARQL.
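
To make the shacl.invoke call above slightly more concrete, here is a 
minimal sketch. Everything in it is an assumption for illustration: the 
shacl object, its function registry, the SH constants and the ex: data 
are hypothetical, and a simple array filter stands in for the SPARQL 
engine that a real implementation would wrap.

```javascript
// Hypothetical stand-in for the data graph: [subject, predicate, object] triples.
const dataGraph = [
  ["ex:Alice", "ex:knows", "ex:Bob"],
  ["ex:Alice", "ex:knows", "ex:Carol"],
  ["ex:Bob", "ex:knows", "ex:Alice"],
];

// Constants for function IRIs, mirroring the SH.valueCount usage above.
const SH = { valueCount: "sh:valueCount" };

// Hypothetical registry mapping function IRIs to implementations.
// A real engine might instead execute the function's SPARQL body.
const shacl = {
  functions: {
    // Count the values of a predicate at a given subject.
    [SH.valueCount]: (subject, predicate) =>
      dataGraph.filter(([s, p]) => s === subject && p === predicate).length,
  },
  // Look up a function by IRI and call it with the given arguments.
  invoke(functionIri, ...args) {
    const fn = this.functions[functionIri];
    if (fn === undefined) {
      throw new Error("Unknown SHACL function: " + functionIri);
    }
    return fn(...args);
  },
};

const count = shacl.invoke(SH.valueCount, "ex:Alice", "ex:knows");
console.log(count);
```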

>
> 10. The document seems to be inconsistent about where sh:severity can
> be used. Example 32 shows sh:severity in a sh:property. The discussion
> of Validation Results implies sh:severity can only appear in native
> and template constraints. Is this because property constraints are
> treated as template constraints?

Yes. Property constraints are template constraints. But there are indeed 
two places where sh:severity can appear: at a sh:Constraint (instance) 
and at a template. The one at the instance has higher priority than the 
one at the template. sh:Error is used as default if both are missing.
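
The lookup order just described can be sketched as follows. This is 
only an illustration under stated assumptions: the resolveSeverity 
helper and the object shapes are hypothetical, and sh:Warning and 
sh:Info are used as placeholder values (only sh:Error is named above).

```javascript
// Default used when neither the instance nor the template sets a severity.
const DEFAULT_SEVERITY = "sh:Error";

// Hypothetical helper: the severity at the constraint instance wins,
// then the one at its template, then the default.
function resolveSeverity(constraint, template) {
  if (constraint && constraint.severity !== undefined) {
    return constraint.severity;
  }
  if (template && template.severity !== undefined) {
    return template.severity;
  }
  return DEFAULT_SEVERITY;
}

// Placeholder template and constraint instances for illustration.
const template = { severity: "sh:Warning" };

console.log(resolveSeverity({ severity: "sh:Info" }, template)); // instance wins
console.log(resolveSeverity({}, template));                      // falls back to template
console.log(resolveSeverity({}, {}));                            // falls back to default
```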

>
> 11. The SHACL Vocabulary Reference defines templates for all the
> built-in property constraints, but this is not useful and it greatly
> expands the size of the vocabulary. I suggest we drop those terms and
> not continue to claim that the built-ins are actually templates. That
> is just an implementation detail.

But the spec is all about details. We cannot define the semantics 
without that level of detail. Yes, it's verbose, but that (ref) document 
will be for geeky readers only - it's just a dump of the Turtle file anyway.

>
> Minor Issues
>
> There are numerous typos, grammar errors, stylistic errors etc.
> However, it is not productive to list them here. I'd be happy to edit
> the source.

This would be great. Please feel free to take a snapshot from master and 
send me an updated version that I can diff in.

Thanks!
Holger
Received on Thursday, 3 September 2015 01:27:12 UTC