- From: Holger Knublauch <holger@topquadrant.com>
- Date: Thu, 29 Oct 2015 15:29:31 +1000
- To: public-data-shapes-wg@w3.org
On 10/29/2015 14:14, Peter F. Patel-Schneider wrote:
> Here is my proposal for a better simplification, based on changes to Part 2 of
> http://w3c.github.io/data-shapes/shacl/index-ISSUE-95.html. The basic idea
> is to
> 1/ get rid of template injection, instead using supertemplates, which I think
> can do everything needed;
> 2/ get rid of abstract classes;
> 3/ get rid of validation functions, as they are not needed; and
> 4/ get rid of functions, as there is no way to call them.
>
>
>
> Part 2 of the SHACL spec is very hard to read. It has quite a few undefined
> terms, and depends on several very difficult-to-do operations. It is
> insufficiently specific in many places.

I agree this requires more editorial work (and thanks for reading the details nevertheless!). I can certainly elaborate on this whole part, but was waiting to have a final design first.

>
> I here propose several changes to the normative bits of the version of Part 2
> that was prepared for ISSUE-95,
> http://w3c.github.io/data-shapes/shacl/index-ISSUE-95.html, that fix many of
> the problems there. I have not proposed changes for any of the bits marked
> non-normative or any of the examples.
>
>
> 6.2
>
> The SPARQL queries linked to a constraint via sh:sparql must be string
> literals that can be parsed into legal SPARQL 1.1 queries of the query form
> SELECT.
> ->
> The values for sh:sparql must be either the empty string literal ("") or
> string literals that can be parsed into legal SPARQL 1.1 queries of the
> query form SELECT. The empty string literal indicates a vacuous constraint,
> i.e., one that never produces any violations.

Could you clarify why the "" literals are needed? Why not simply provide no value instead?

>
> SHACL also includes a more general superclass sh:Template that may be used
> for other kinds of templates (rules, stored queries etc). Well-defined,
> non-abstract templates must provide at least one body using a property such
> as sh:sparql.
> ->
> Well-defined templates must provide at least one body using a property such
> as sh:sparql.

Why drop sh:Template? It is already used as the shared superclass of sh:ConstraintTemplate and sh:ScopeTemplate. We also have various other types of templates in production, and having a shared superclass streamlines the infrastructure to manage them.

>
> 7.4
>
> [All of 7.4]
> ->
> It is sometimes desirable to mix multiple templates so that
> they can be used within the same constraint. This is done by making a
> template class that is a subclass of multiple other template classes. An
> instance of the child template class then combines the effects of all these
> templates, because it is an instance of them all.

You state above that you want to get rid of abstract superclasses. In the current TTL file, there is a class sh:AbstractPropertyConstraint which defines the argument sh:predicate once for all of its subclasses. Similarly, there is now a shared superclass for the two templates defining sh:qualifiedMinCount and sh:qualifiedMaxCount, to define their shared argument sh:qualifiedValueShape. How would your design handle these cases? Also, why would anyone want to instantiate something like sh:AbstractDatatypePropertyConstraint directly? The language only really supports instantiating sh:PropertyConstraint.

My reason for introducing template injection was to be able to distinguish the "inheritance" of arguments from "merging" them into a single node. My previous design mixed those two aspects, blurring the lines between the two use cases. While that design is not completely out of the question, a major problem that I ran into was the treatment of optional arguments. Examples of this include sh:ignoredProperties (from closed shapes) and sh:flags (from sh:pattern). If we only have a single mechanism, then a superclass such as sh:PropertyConstraint would "inherit" all properties such as sh:minCount and sh:pattern as non-optional. Do you have a better solution to this?
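To make the optional-argument problem concrete, here is a toy model in Python. It is only a sketch under my own assumptions: the Template class, the template names and the missing_required helper are hypothetical and are not part of SHACL or of any implementation. It models a single subclass-based mechanism in which a class collects the arguments of everything it subclasses.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    """Hypothetical stand-in for a template class; not a SHACL API."""
    name: str
    # Maps argument name -> True if the argument is optional.
    arguments: dict = field(default_factory=dict)
    supers: list = field(default_factory=list)

    def all_arguments(self):
        # Collect arguments from all supertemplates, then add our own.
        merged = {}
        for parent in self.supers:
            merged.update(parent.all_arguments())
        merged.update(self.arguments)
        return merged


def missing_required(template, node):
    """Return the non-optional arguments that the node does not provide."""
    return sorted(
        name
        for name, optional in template.all_arguments().items()
        if not optional and name not in node
    )


# Assumed declarations: sh:flags is optional, everything else is required.
min_count = Template("MinCountTemplate", {"sh:minCount": False})
pattern = Template("PatternTemplate", {"sh:pattern": False, "sh:flags": True})

# With one mechanism only, the combined class picks up every argument.
property_constraint = Template(
    "PropertyConstraint", {"sh:predicate": False}, [min_count, pattern]
)

# A property constraint that only wants to state a minimum count.
node = {"sh:predicate": "ex:name", "sh:minCount": 1}
print(missing_required(property_constraint, node))  # ['sh:pattern']
```

In this model the node is flagged as incomplete although it never intended to use sh:pattern, which is the effect I was trying to avoid by keeping "inheritance" and "merging" apart.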
>
> 7.6
>
> If a sh:PropertyValueConstraintTemplate has a value for
> sh:validationFunction, ... [to end of section]
> ->
> [empty]

My mistake: the branch mentions sh:validationFunction, but that is a left-over that needs to be replaced. It should check for instance-of sh:NodeValidationFunction instead. However, you seem to want to delete the whole mechanism of using functions here. The problem that this design was supposed to address is that we would otherwise need to introduce many duplicate templates, e.g. to share the functionality of sh:class between sh:property and sh:inverseProperty. In some cases, we would need to define four templates where a single function would suffice. Having worked with the former approach for too long (and countless copy-and-pastes later), I really want to move to functions, and I am confident that other users of advanced SHACL will share this sentiment. The functions have the additional benefit that they only need to be of the ASK format, reducing the boilerplate of the SELECT clauses.

>
>
> 8
>
> All this is analogous to how constraints work, but with
> the additional restrictions:
> * All subjects of sh:scope triples must be IRIs
> * The arguments of a scope template must not be blank nodes
> ->
> All this is analogous to how constraints work.

I had introduced those two clauses for a reason, when I implemented the SPARQL code generation. This gets complicated. If the subjects of sh:scope triples are blank nodes, then it becomes impossible to generate SPARQL code that "points" at the scope declaration. As far as I remember, the problem was that each scope essentially becomes a nested SELECT DISTINCT clause. Due to the inside-out evaluation policy of SPARQL, it becomes impossible to pass pre-bound variables into such clauses, especially not blank nodes (see second bullet item above).
So my work-around was to rely on property functions (magic properties) that I defined for Jena to produce the bindings, passing in the scope shape as a URI.

Do you have examples where scope template arguments must be blank nodes? Do you have arguments for blank nodes as subjects of sh:scope? Although I can understand why conceptually such things should not matter, I believe allowing either will vastly complicate the implementation of this feature. It would be good to have a second implementation look into this problem space to confirm or reject the problems that I have encountered. If someone has a better solution, then I am happy to change my viewpoint. Meanwhile, I'd suggest staying conservative, with an approach that is better under control.

>
> 8.1
>
> The SPARQL queries linked to a scope via sh:sparql must be of the query form
> SELECT, or a fragment that produces a valid SELECT query if wrapped by
> SELECT ?this WHERE { ... }. The SELECT queries must project to the result
> variable ?this.
> The SELECT queries must also be executable when converted to an ASK query
> and with a pre-bound value for ?this. The set of bindings for ?this that
> return true for such ASK queries must be identical to the set produced by
> the SELECT query. This constraint makes sure that engines can validate
> whether a given shape applies to a given focus node as part of the
> validateNode operation.
> ->
> The SPARQL queries linked to a scope via sh:sparql must be of the query form
> SELECT ?this WHERE { ... }.

Ok, I could live without allowing the fragments, for simplification purposes. The reason for the second paragraph (on the pre-bound variable for ?this) is the validation of individual nodes. For example, when someone has a shape with a custom scope and you have ex:MyInstance, then the algorithm to determine whether the shape applies to the instance can be much more efficient than having to evaluate the whole scope and check whether the result set contains ex:MyInstance.
The latter would become prohibitively slow for large databases. Do you have examples of scopes where that restriction would be an obstacle? The (few) examples of custom scopes that I have seen were easily convertible into ASK queries without changing the WHERE clause. An alternative to dropping this bidirectionality would be to have an optional second property sh:inverseSPARQL that can be attached to a scope for those cases where the original scope query cannot be converted to ASK. I would be OK with that.

>
>
> 9
>
> [Remove entirely, as there is no defined way to call functions.]

This is neither true nor helpful. SHACL functions can be called from every SPARQL query (e.g. constraint or scope). Regardless of whether we keep sh:NodeValidationFunctions, the general mechanism has proven to be extremely successful in SPIN, leading to vastly more compact and more maintainable SPARQL queries. The fact that sh:NodeValidationFunctions are also normal SPARQL functions means that the business logic can be reused in multiple places. There are approved requirements for functions; even "concise language" falls into that category.

>
> Well-defined, non-abstract functions must provide at least one
> body property such as sh:sparql.
> ->
> Well-defined functions must provide at least one
> body property such as sh:sparql.
>
> 9.4
>
> [Remove, if implementations want to analyze functions to see if they are
> cacheable they are free to do so.]

I could certainly live with making this a TopBraid-only feature and I will not fight for it. We did, however, have some use cases where this analysis is not easily possible. Examples are queries against read-only graphs with background data. How would an engine determine that all ?x in GRAPH ?x { ... } are read-only graphs?

>
> 9.5
>
> [Remove, as implementing this will require parsing and modifying SPARQL
> bodies.]

What modifications are required? I had explained how these are invoked in 7.6.

Thanks,
Holger
Received on Thursday, 29 October 2015 05:30:10 UTC