- From: Holger Knublauch <holger@topquadrant.com>
- Date: Thu, 3 Sep 2015 11:26:33 +1000
- To: public-data-shapes-wg@w3.org
- Message-ID: <55E7A1C9.3000105@topquadrant.com>
Hi Arthur,
many thanks for your feedback, especially during a vacation within
retirement :)
On 9/2/2015 11:24, Arthur Ryman wrote:
> Major Issues
1. and 2. below are difficult for me to address. I need a decision from
the WG on how to proceed. I believe your viewpoint reflects your
experience with Resource Shapes, where no extension mechanism existed.
OTOH, Peter's feedback rather reflects his SHACL-SPARQL design, and he
seems to want to downplay the role of other execution languages besides
SPARQL. Between these two viewpoints, there is the current technical
design which relies on templates internally, even though this fact
doesn't need to be known to users of the Core Language.
The previous rounds of this topic led to the division of the spec into
Core and Advanced parts. A completely different approach would be to use
the spec as a way to build the formal foundations of SHACL according to
the technical layering, basically starting with Templates and then
enumerating the built-in templates. But this would only be practical if
we had a Primer that addresses the big picture because many people would
run away screaming. All this is not a trivial change that I can make
easily, and it would likely push us back by several months. This requires
discussion.
>
> 1. The document conflates the two distinct uses we are making of
> SPARQL. First, we are using SPARQL to define the semantics of the
> built-in constraints. Second, we are supporting SPARQL as an
> executable extension language. The document must make these two uses
> clear.
See above. I believe these two aspects are just two sides of the same coin.
> I suggest that the document replace the phrase "executable
> language" with "extension language".
I don't like the term extension language, because SPARQL is not just an
extension but an integral part of the language's definition. But Peter
had similar comments so this requires further thought.
> Also, the document should explain
> that the semantics of some features of SHACL is given via SPARQL, but
> implementations are free to use any technology as long as they produce
> the same results.
I believe this is already stated in 1.3: "any alternative bodies need to
follow the same semantics as the SPARQL queries." The whole division of
the document into Core and Advanced parts was supposed to make clear
that some engines may only support the Core and implement it in whatever
way they want.
It would be best if you could send me a specific sentence that I can
inject somewhere.
>
> 2. The document treats the built-in property constraints as templates.
> They are not conceptually templates - they are primitive, built-in
> capabilities. Templates are an advanced feature.
No, the built-in property constraints are not primitive, built-in
capabilities. They are implemented using templates and work just like
any other template defined by someone else. This design philosophy is a
central strength of SHACL because it allows engines to work with a very
simple and generic core implementation. But you are right that Templates
are only introduced as advanced features and the fact that templates are
used for the core doesn't need to be known to Core users. The first
usage of the term "template" after the Overview is in section 6.4.
> Furthermore, the
> syntax with which the built-in property constraints are invoked is
> different than the way template constraints are invoked. The built-ins
> are invoked via sh:property and sh:inverseProperty but templates are
> invoked via sh:constraint.
No, some built-ins are also invoked via sh:constraint (e.g.
sh:ClosedShape, sh:AndConstraint) and people can subclass
sh:PropertyConstraint and then use these subclasses via sh:property.
> I understand Holger's statement that one
> can implement the built-ins via templates but that is just an
> implementation detail and should not bleed through into the spec.
> Users should not be forced to understand templates in order to use the
> built-ins.
Where do users need to understand templates to use the built-ins?
>
> 3. The document is inconsistent in its use of the term "constraint".
> Sometimes it is used in a general way, and sometimes in a very
> specific way referring to elements of the SHACL vocabulary. I suggest
> that whenever the document uses constraint in a specific way that it
> replace "constraint" with the corresponding RDF term from the SHACL
> vocabulary.
This is tricky because the corresponding RDF term - sh:Constraint -
basically never shows up anywhere. People either use native constraints
(with a SPARQL query) or instantiate a template constraint, which is a
subclass of sh:Constraint. So I am not sure how to fix this. Again,
specific suggestions would be ideal.
>
> 4. The Overview is missing a key concept, namely that of a "shape
> schema" which is a set of shapes that refer to each other, i.e. any
> reference to a shape, e.g. via sh:valueShape, must resolve to a shape
> definition within the shape schema.
Ok. I have added the sentence
"Shape and constraint definitions are represented in RDF graphs called
shapes graphs."
to introduce the term that is also used elsewhere.
>
> 5. The design for ClosedShape is inconsistent with the rest of SHACL.
> The document treats ClosedShape as a constraint, but it is actually a
> characteristic of the shape. It should be promoted to be a direct
> property of sh:Shape and not wrapped in a sh:constraint.
This is more than an editorial change. If you want to make this specific
suggestion, please raise an ISSUE.
>
> 6. Why are templates themselves classes? This seems like a minimally
> useful way to inherit arguments. A subtemplate must provide a new
> executable body that overrides the one defined in the supertemplate.
> So what is the actual benefit of inheritance other than for arguments?
> Seems like unnecessary complexity.
The main reason why templates are classes is that their template calls
are *instances* with an rdf:type triple. Otherwise we would need to invent
some other RDF syntax, whereas rdf:type was already used successfully for
this in SPIN. Inheritance of arguments is one benefit. Another is the
consistency that template definitions define their sh:arguments as
constraints, making it possible to validate shapes graphs with a SHACL
engine. Some templates even define SPARQL constraints to enforce
additional syntactical constraints (e.g. cannot have both sh:datatype
and sh:valueClass).
Again, if you feel there is something wrong with this design, please
file an ISSUE.
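To illustrate the kind of syntactic check mentioned above, here is a minimal JavaScript sketch. In SHACL itself such a rule would be a SPARQL constraint attached to the template; the function name checkPropertyConstraint and the plain-object representation of a property constraint are assumptions made for this example only.

```javascript
// Hypothetical helper: a property constraint is modelled as a plain object
// whose keys are prefixed property names (an assumption for this sketch).
function checkPropertyConstraint(pc) {
  const errors = [];
  const has = (p) => Object.prototype.hasOwnProperty.call(pc, p);

  // The rule named in the text: sh:datatype and sh:valueClass exclude each other.
  if (has("sh:datatype") && has("sh:valueClass")) {
    errors.push("cannot have both sh:datatype and sh:valueClass");
  }
  return errors;
}

// Example inputs (ex: names are made up for illustration).
const okConstraint = { "sh:predicate": "ex:age", "sh:datatype": "xsd:integer" };
const badConstraint = {
  "sh:predicate": "ex:age",
  "sh:datatype": "xsd:integer",
  "sh:valueClass": "ex:Person",
};
```

Calling checkPropertyConstraint on the first object yields an empty list, and on the second a list with one message.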
>
> 7. The description of the invocation API uses highly abbreviated
> pseudocode and has a lot of implicit context. I had to guess at what
> the pseudocode meant. The API needs to be made much more explicit.
I agree this is very compact and I welcome other specific proposals on
how to write all this down. But if I want to be explicit, then I need
the API for execution languages (your next point), because these depend
on each other.
I have assigned myself a task to rewrite that section. It will need to
become more precise, e.g. I am using triple patterns and SPARQL
fragments in some places. How do other specs handle this? Maybe I should
use real JavaScript or Java syntax, and define the methods that need to
be implemented?
>
> 8. The API for execution languages was also fairly abbreviated. Why do
> we even need to define this? A useful API would require language
> bindings. I don't think we need to go there at this point. I suggest
> we drop the extension language API from the spec.
See above.
>
> 9. Why are we defining sh:Function? These are only invocable from an
> extension language, but extension languages have their own mechanisms
> for defining functions. I suggest we drop this from the spec.
In a new version of the spec, validation functions are used to define
many property constraints, so they are now an essential part of the spec.
The SPARQL bodies of several built-ins also require functions (e.g.
sh:valueCount). We also had several resolutions to support functions. If
you think there is a problem, please raise an ISSUE. Of course they may
appear irrelevant to someone focusing on the core language, but they are
crucial for most SPARQL-based scenarios that we encounter in practice.
Also note that even from JavaScript there could be a mechanism to invoke
SHACL functions, e.g.
var count = shacl.invoke(SH.valueCount, subject, predicate);
If the JavaScript implementation wraps a SPARQL engine that talks to the
data stored in JSON-LD, this will work nicely. Likewise, we have
functions that use JavaScript, called from SPARQL.
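As a rough sketch of how such an invocation mechanism could look, the following JavaScript registers function implementations by IRI and dispatches to them. Everything except the call shape shacl.invoke(SH.valueCount, subject, predicate) is hypothetical: createShacl, register, and the in-memory triple array are assumptions, and the array filter merely stands in for the SPARQL engine that a real implementation would wrap.

```javascript
// SHACL namespace IRI; only sh:valueCount is needed for this sketch.
const SH = { valueCount: "http://www.w3.org/ns/shacl#valueCount" };

// Hypothetical factory: binds a registry of functions to a data graph.
function createShacl(triples) {
  const functions = new Map();
  return {
    // Associate an implementation with a function IRI.
    register(iri, impl) {
      functions.set(iri, impl);
    },
    // Look up the function by IRI and call it with the graph and arguments.
    invoke(iri, ...args) {
      const impl = functions.get(iri);
      if (!impl) throw new Error("Unknown SHACL function: " + iri);
      return impl(triples, ...args);
    },
  };
}

// Made-up example data; a real engine would query a store instead.
const data = [
  { s: "ex:Alice", p: "ex:knows", o: "ex:Bob" },
  { s: "ex:Alice", p: "ex:knows", o: "ex:Carol" },
  { s: "ex:Bob", p: "ex:knows", o: "ex:Alice" },
];

const shacl = createShacl(data);

// Stand-in for the SPARQL body: count the values of predicate at subject.
shacl.register(SH.valueCount, (graph, subject, predicate) =>
  graph.filter((t) => t.s === subject && t.p === predicate).length
);

const count = shacl.invoke(SH.valueCount, "ex:Alice", "ex:knows");
```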
>
> 10. The document seems to be inconsistent about where sh:severity can
> be used. Example 32 shows sh:severity in a sh:property. The discussion
> of Validation Results implies sh:severity can only appear in native
> and template constraints. Is this because property constraints are
> treated as template constraints?
Yes. Property constraints are template constraints. But there are indeed
two places where sh:severity can appear: at a sh:Constraint (instance)
and at a template. The one at the instance has higher priority than the
one at the template. sh:Error is used as default if both are missing.
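The precedence just described can be sketched in JavaScript as follows. The function name resolveSeverity and the plain-object representation of a constraint and its template are assumptions for illustration; sh:Warning and sh:Info are used only as example values.

```javascript
// Default named in the text above.
const DEFAULT_SEVERITY = "sh:Error";

// Hypothetical helper: constraint and template are plain objects keyed by
// prefixed property names (an assumption for this sketch).
function resolveSeverity(constraint, template) {
  // 1. A sh:severity on the constraint instance has the highest priority.
  if (constraint && constraint["sh:severity"] !== undefined) {
    return constraint["sh:severity"];
  }
  // 2. Otherwise use the sh:severity declared at the template.
  if (template && template["sh:severity"] !== undefined) {
    return template["sh:severity"];
  }
  // 3. Fall back to the default when both are missing.
  return DEFAULT_SEVERITY;
}
```

For example, resolveSeverity({ "sh:severity": "sh:Warning" }, { "sh:severity": "sh:Info" }) picks the instance value, while resolveSeverity({}, {}) falls back to sh:Error.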
>
> 11. The SHACL Vocabulary Reference defines templates for all the
> built-in property constraints, but this is not useful and it greatly
> expands the size of the vocabulary. I suggest we drop those terms and
> not continue to claim that the built-ins are actually templates. That
> is just an implementation detail.
But the spec is all about details. We cannot define the semantics
without such a level of detail. Yes, it's verbose, but that (ref) document
will be for geeky readers only - it's just a dump of the Turtle file anyway.
>
> Minor Issues
>
> There are numerous typos, grammar errors, stylistic errors etc.
> However, it is not productive to list them here. I'd be happy to edit
> the source.
This would be great. Please feel free to take a snapshot from master and
send me an updated version that I can diff in.
Thanks!
Holger
Received on Thursday, 3 September 2015 01:27:12 UTC