Re: shapes-ISSUE-79 (Validation functions): Cleaner separation between value checking and property iteration [SHACL Spec] from Simon Steyskal on 2015-08-13 (public-data-shapes-wg@w3.org from August 2015)

From: Simon Steyskal <simon.steyskal@wu.ac.at>
Date: Thu, 13 Aug 2015 07:46:48 +0200
To: Holger Knublauch <holger@topquadrant.com>
Cc: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <028a1bcec8ad644deddce871a6c59d05@wu.ac.at>

Hi!

At first glance I definitely see the benefits of your proposed 
approach. However, I have two questions:

1) You've implemented those sh:ValidationFunctions as ASK queries 
instead of SELECTs. Don't we lose important information for creating 
violation messages (e.g. the bindings ?this (?this AS ?subject) 
?predicate ?object ?datatype)?

2) Is there a particular reason why 
"sh:AbstractArgumentMaxCountConstraint" isn't using such a validation 
function? (Probably because you just wanted to demonstrate the approach 
on a handful of examples?)

cheers,
simon

---
DDipl.-Ing. Simon Steyskal
Institute for Information Business, WU Vienna

www: http://www.steyskal.info/  twitter: @simonsteys

On 2015-08-13 04:29, RDF Data Shapes Working Group Issue Tracker 
wrote:
> shapes-ISSUE-79 (Validation functions): Cleaner separation between
> value checking and property iteration [SHACL Spec]
> 
> http://www.w3.org/2014/data-shapes/track/issues/79
> 
> Raised by: Holger Knublauch
> On product: SHACL Spec
> 
> I was never quite happy with one aspect of how SHACL templates
> (including the Core templates) were internally defined. To illustrate
> the problem, this is how sh:datatype is currently defined:
> 
> 
> sh:AbstractDatatypePropertyConstraint
> 	a sh:ConstraintTemplate ;
> 	rdfs:subClassOf sh:AbstractPropertyConstraint ;
>         ...
> 	sh:message "Values must have datatype {?datatype}" ;
> 	sh:sparql """
> 		SELECT ?this (?this AS ?subject) ?predicate ?object ?datatype
> 		WHERE {
> 			?this ?predicate ?object .
> 			FILTER (!sh:hasDatatype(?object, ?datatype)) .
> 		}
> 		""" ;
> .
> 
> The SPARQL query above does two things:
> a) It iterates over all property values
> b) It FILTERs each of these values.
> 
> This is a recurring pattern for many property constraints. I would
> like us to make this design pattern more explicit so that it becomes
> the following:
> 
> sh:AbstractDatatypePropertyConstraint
> 	a sh:PropertyValueConstraintTemplate ;
> 	rdfs:subClassOf sh:AbstractPropertyConstraint ;
>         ...
> 	sh:message "Values must have datatype {?datatype}" ;
> 	sh:validationFunction sh:hasDatatype ;
> .
> 
> Instead of pointing at a SPARQL query that does everything, such
> constraints only point at a Function, which must take a value as input
> and return a boolean. The engine can produce the surrounding SPARQL
> automatically, and can even directly inject the body of the
> sh:hasDatatype function.
> 
> This design has the following advantages:
> - It makes the contract more explicit, modular and arguably cleaner
> - It lets authors focus on what really matters, reducing clutter
> - It makes it easier to reuse the logic, especially for inverse
> property constraints (and sh:Arguments)
> - It makes it easier to optimize execution - if only a boolean result
> is needed, then a code generator can more easily combine them into a
> single FILTER such as sh:hasDatatype(...) && sh:hasNodeKind(...)
> - It lowers the implementation costs for other languages like
> JavaScript - these can focus on implementing the functions
> - It raises the abstraction, e.g. in JavaScript these checks can be
> simple variable comparisons, regardless of how the surrounding
> iteration happened
> - The key snippets are also reusable inside of SPARQL expressions,
> because they are also SPARQL functions.
> 
> The only disadvantage that I can think of is that there is a little
> bit more work for the engine implementers, because this requires a
> couple of new classes and properties to work correctly. However, I'd
> rather push the complexity to the engine developers and have a cleaner
> overall design for the end users. And, nobody is forced to use this
> new pattern - anyone can still use the current mechanism using
> sh:sparql.
> 
> I have implemented this on a test branch, and a new Turtle file can be
> found here:
> 
> https://github.com/w3c/data-shapes/blob/ISSUE-79/shacl/shacl.shacl.ttl
> 
> I have not yet updated the textual documents - I'd love to hear from
> others on the general direction before I spend more time on this.
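
The quoted proposal says the engine can produce the surrounding SPARQL 
automatically and can combine several boolean checks into a single 
FILTER. A minimal sketch of that wrapping step follows, written in 
Python purely for illustration. The helper name build_wrapper_query, 
the ?nodeKind variable and the way arguments are passed are assumptions 
made for this example and are not taken from the SHACL drafts or from 
the ISSUE-79 branch. The sketch also assumes that argument variables 
such as ?predicate and ?datatype are pre-bound by the engine, as the 
existing sh:sparql query appears to rely on.

```python
# Hypothetical sketch: build_wrapper_query and its argument format are
# illustrative assumptions, not part of SHACL or of any real engine.

def build_wrapper_query(checks):
    """Build a SELECT that iterates over property values and reports
    every value for which the combined validation functions return false.

    checks: list of (function_name, [argument_variables]) pairs,
            e.g. [("sh:hasDatatype", ["?datatype"])].
    """
    if not checks:
        raise ValueError("at least one validation function is required")

    # Each function receives the value (?object) first, then its own arguments.
    calls = [
        "{}({})".format(name, ", ".join(["?object"] + list(args)))
        for name, args in checks
    ]
    # All checks are joined into one boolean expression for a single FILTER.
    condition = " && ".join(calls)

    # Keep the argument variables in the projection (in order, without
    # duplicates) so they stay available for violation messages.
    extra_vars = []
    for _, args in checks:
        for var in args:
            if var not in extra_vars:
                extra_vars.append(var)

    projection = " ".join(
        ["?this", "(?this AS ?subject)", "?predicate", "?object"] + extra_vars
    )
    lines = [
        "SELECT " + projection,
        "WHERE {",
        "    ?this ?predicate ?object .",
        "    FILTER (!(" + condition + ")) .",
        "}",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Single check, mirroring the sh:datatype example from the proposal.
    print(build_wrapper_query([("sh:hasDatatype", ["?datatype"])]))
    print()
    # Two checks combined into one FILTER, as the proposal suggests.
    print(build_wrapper_query([
        ("sh:hasDatatype", ["?datatype"]),
        ("sh:hasNodeKind", ["?nodeKind"]),
    ]))
```

In this sketch the wrapper stays a SELECT even though each validation 
function only returns a boolean, so the bindings used for violation 
messages are still projected by the generated query. Whether the 
engine on the test branch works this way is not stated in the thread.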

Received on Thursday, 13 August 2015 05:47:16 UTC