Re: ISSUE-139: implementing (core) constraint components universally

My original message in this thread is mostly concerned with how to implement
constraint components universally.  There is also a short preamble on how
best to describe constraint components.  I'm going to defend both
of these points separately, but I'm going to start with the implementation
point as that is the bulk of my original message.


Right now constraint components have up to three different implementations -
one when they occur in a property constraint, one when they occur in an
inverse property constraint, and one when they occur in a node constraint.
This means that there are up to three different pieces of code for each
constraint component, each (hopefully) implementing the same functionality.
I view this as a poor setup - three different pieces of code that have to be
written and thus three places where bugs can be introduced.

Having a single implementation of each constraint component would actually
reduce development costs.  Ideally, this single implementation would be as
simple as the ask validators that implement many constraint components.
Consider, for example, sh:minCount whose implementation should be very
little more than "HAVING ( COUNT (DISTINCT ?value) < ?minCount )".  I can't
figure out how to do this nicely because of limitations in SPARQL, hence the
solution with boilerplate.  Even so, the boilerplate solution has only one
implementation of each constraint component, and here one is definitely
better than three and also better than two.
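
To make the single-implementation point concrete, here is a minimal sketch in
Python rather than SPARQL.  Everything in it is an assumption for illustration
only: the graph is a plain set of triples, and value_nodes and
violates_min_count are hypothetical names, not part of SHACL or of any
implementation.  The three kinds of constraint differ only in how the value
nodes are computed; the minCount check itself is written once.

```python
# Hypothetical sketch: a graph is a set of (subject, predicate, object) triples.

def value_nodes(graph, focus, context, predicate=None):
    """Return the set of value nodes for a focus node in one of three contexts."""
    if context == "node":
        # Node constraint: the focus node is its own single value node.
        return {focus}
    if context == "property":
        # Property constraint: objects of focus/predicate triples.
        return {o for (s, p, o) in graph if s == focus and p == predicate}
    if context == "inverse":
        # Inverse property constraint: subjects pointing at the focus node.
        return {s for (s, p, o) in graph if o == focus and p == predicate}
    raise ValueError(f"unknown context: {context}")


def violates_min_count(values, min_count):
    """Single check for minCount: too few distinct value nodes is a violation."""
    return len(values) < min_count


graph = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:knows", "ex:carol"),
    ("ex:dave", "ex:knows", "ex:alice"),
}

for context in ("node", "property", "inverse"):
    values = value_nodes(graph, "ex:alice", context, "ex:knows")
    print(context, sorted(values), violates_min_count(values, 2))
```

The same violates_min_count function serves all three contexts; only the
computation of the value nodes changes.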


Describing all constraint components in a similar fashion is also preferable
to describing them differently.  Right now some constraint components, e.g.,
sh:class, are described using the notion of value nodes but others, e.g.,
sh:minCount, are described using focus nodes and predicates even when the
effect is the same as value nodes.  Regularizing the way that constraint
components are described would reduce the number of ways that errors can
creep into the document and also reduce the cognitive load on readers of the
document.  Describing constraint components in terms of value nodes also
better shows the commonalities amongst them.

It is currently possible to have a constraint component that works
completely differently when it is in a node constraint from when it is in a
property constraint and from when it is in an inverse property constraint.
Using the notion of value nodes produces a force against this divergence.


These issues arise from having all constraint components sit inside the
three different kinds of constraints and having each constraint component
be responsible for its own determination of value nodes.  There are
different approaches to SHACL that would eliminate these issues.  ShEx has a
single property-crossing construct and all other constructs in triple
expressions are not concerned with properties.  OWL has several
property-crossing constructs but most constructs in OWL work on individual
value nodes.  My refactored SHACL syntax has a single property-crossing
construct and all constructs work on sets of value nodes.
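
As a rough illustration of that last design (not the actual refactored
syntax), the following Python sketch has one construct that crosses a
property, or its inverse, or nothing at all, and components that only ever see
a set of value nodes.  The names Crossing, has_value and max_count are
hypothetical, and reading sh:hasValue as "the given node is among the value
nodes" is an assumption made for the sketch.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]
# A component is a test on a set of value nodes; True means "conforms".
Component = Callable[[Set[str]], bool]


def has_value(node: str) -> Component:
    """Assumed reading of sh:hasValue: the given node is among the value nodes."""
    return lambda values: node in values


def max_count(n: int) -> Component:
    """At most n distinct value nodes."""
    return lambda values: len(values) <= n


@dataclass
class Crossing:
    """The only construct that looks at properties (hypothetical name)."""
    predicate: Optional[str] = None   # None: the focus node is the value node
    inverse: bool = False
    components: List[Component] = field(default_factory=list)

    def value_nodes(self, graph: Set[Triple], focus: str) -> Set[str]:
        if self.predicate is None:
            return {focus}
        if self.inverse:
            return {s for (s, p, o) in graph if p == self.predicate and o == focus}
        return {o for (s, p, o) in graph if p == self.predicate and s == focus}

    def conforms(self, graph: Set[Triple], focus: str) -> bool:
        # Components never see the predicate, only the set of value nodes.
        values = self.value_nodes(graph, focus)
        return all(component(values) for component in self.components)


graph = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:knows", "ex:carol"),
}
shape = Crossing("ex:knows", components=[has_value("ex:bob"), max_count(2)])
print(shape.conforms(graph, "ex:alice"))
```

Because the components are unaware of how the value nodes were obtained, a
component cannot behave differently in the node, property, and inverse
property cases.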

peter


On 06/02/2016 10:13 PM, Holger Knublauch wrote:
> Could you help me understand why we should do this? All I am seeing is that
> this would add complexity to the language, add development costs for these
> additional cases, increase our burden to specify and write test cases for all
> these scenarios, for the "benefit" that people can apply entirely useless
> constructs such as minCount with node constraints or datatypes for subjects
> which can never be literals.
> 
> Furthermore, deleting the concept of sh:context makes it impossible for tools
> to determine under which conditions a constraint component should be offered.
> The forms that I have implemented would display every constraint property on
> every case - node constraints, property constraints, inverse property
> constraints. This is not user friendly!
> 
> Finally, every extension developer is forced to specify SPARQL queries for all
> cases, even if they make no sense (like most of the cases below). Some of the
> queries that you have written up are completely different from their other
> variations. How can you be sure that the same generalization is sensible for
> every possible future extension?
> 
> As a random example consider one of the original Use cases: specifying a
> primary key. These are only ever meant to be used for properties, neither
> inverses nor node constraints nor paths.
> 
> https://www.w3.org/TR/shacl-ucr/#uc25-primary-keys-with-uri-patterns
> 
> I must be missing something, but this is a massive step backwards and a
> serious risk to the success of SHACL. There is nothing broken right now with
> the context mechanism. Why change it?
> 
> Thanks,
> Holger
> 
> 
> On 3/06/2016 7:19, Peter F. Patel-Schneider wrote:
>> To think about how a constraint component works universally, it is
>> sufficient to think about value nodes, which are already defined at the
>> beginning of Section 4.
>>
>> So, sh:hasValue is then just that a value node is the given node and
>> sh:equals is just that the set of value nodes is the same as the set of
>> values for the focus node for the other property and sh:closed is just that
>> every value node has no values for disallowed properties and sh:minCount is
>> just that there are at least n value nodes.
>>
>>
>> Looking at https://github.com/TopQuadrant/shacl the changes to permit core
>> constraint components to be used universally appear to be as follows:
>>
>> 1/ Ensure that sh:context has all three relevant values for each constraint
>> component.  (Of course then sh:context becomes irrelevant and can be
>> removed.)
>>
>> 2/ For the constraint component for:
>>
>> sh:closed add
>>    sh:propertyValidator [
>>        rdf:type sh:SPARQLSelectValidator ;
>>        sh:message "Predicate {?unallowed} is not allowed on {?subject} (closed shape)" ;
>>        sh:sparql """
>>         SELECT ?this (?val AS ?subject) ?unallowed ?object
>>         WHERE {
>>             {
>>                 FILTER ($closed) .
>>             }
>>             $this $predicate ?val .
>>             ?val ?unallowed ?object .
>>             FILTER (NOT EXISTS {
>>                 GRAPH $shapesGraph {
>>                     $currentShape sh:property/sh:predicate ?unallowed .
>>                 }
>>             } && (!bound($ignoredProperties) || NOT EXISTS {
>>                 GRAPH $shapesGraph {
>>                     $ignoredProperties rdf:rest*/rdf:first ?unallowed .
>>                 }
>>             }))
>>         }
>> """ ;
>> Similar for inverse property constraint.
>> sh:closed should also be implementable using the simple form (like
>> sh:datatype and sh:minExclusive are).
>>
>> sh:datatype    add dash:hasDatatype as a value for sh:inversePropertyValidator
>> sh:datatypeIn    add dash:hasDatatypeIn as a value for
>> sh:inversePropertyValidator
>>
>> sh:hasValue    add
>>    sh:nodeValidator [
>>        rdf:type sh:SPARQLSelectValidator ;
>>        sh:message "Node is not value {$hasValue}" ;
>>        sh:sparql """
>>         SELECT $this
>>         WHERE {
>>             FILTER ( !sameTerm($this, $hasValue) )
>>         }
>>         """ ;
>>      ] ;
>>
>> sh:disjoint add
>>    sh:inversePropertyValidator [
>>        rdf:type sh:SPARQLSelectValidator ;
>>        sh:message "Inverse of property must not share any values with {$disjoint}" ;
>>        sh:sparql """
>>         SELECT $this ($this AS ?object) $predicate ?subject
>>         WHERE {
>>             ?subject $predicate $this .
>>             ?subject $disjoint $this  .
>>         }
>>         """ ;
>>      ] ;
>>    sh:nodeValidator [
>>        rdf:type sh:SPARQLSelectValidator ;
>>        sh:message "Node must not be a value of {$disjoint}" ;
>>        sh:sparql """
>>         SELECT $this
>>         WHERE {
>>             $this $disjoint ?this .
>>         }
>>         """ ;
>>      ] ;
>>
>> sh:equals add
>>    sh:inversePropertyValidator [
>>        rdf:type sh:SPARQLSelectValidator ;
>>        sh:message "Inverse of property must have same values as {$equals}" ;
>>        sh:sparql """
>>         SELECT $this ($this AS ?object) $predicate ?subject
>>         WHERE {
>>             {
>>                 ?subject $predicate $this .
>>                 FILTER NOT EXISTS {
>>                     ?subject $equals $this  .
>>                 }
>>             }
>>             UNION
>>             {
>>                 ?subject $equals $this .
>>                 FILTER NOT EXISTS {
>>                     ?subject $predicate $this .
>>                 }
>>             }
>>         }
>>         """ ;
>>      ] ;
>>    sh:nodeValidator [
>>        rdf:type sh:SPARQLSelectValidator ;
>>        sh:message "Node must be a value of {$equals}" ;
>>        sh:sparql """
>>         SELECT $this
>>         WHERE {
>>             FILTER NOT EXISTS { $this $equals $this }
>>         }
>>         """ ;
>>      ] ;
>>
>> sh:lessThan add
>>    sh:inversePropertyValidator [
>>        rdf:type sh:SPARQLSelectValidator ;
>>        sh:message "Inverse property value is not < value of {$lessThan}" ;
>>        sh:sparql """
>>         SELECT $this ($this AS ?object) $predicate ?subject
>>         WHERE {
>>             ?subject $predicate $this  .
>>               $this $lessThan ?object2  .
>>             FILTER (!(?subject < ?object2)) .
>>         }
>>         """ ;
>>      ] ;
>>    sh:nodeValidator [
>>        rdf:type sh:SPARQLSelectValidator ;
>>        sh:message "Node is not < value of {$lessThan}" ;
>>        sh:sparql """
>>         SELECT $this
>>         WHERE {
>>             $this $lessThan ?object2 .
>>             FILTER (!(?this < ?object2)) .
>>         }
>>         """ ;
>>      ] ;
>>
>> sh:lessThanOrEquals similar
>>
>> sh:minCount add
>>    sh:nodeValidator [
>>        rdf:type sh:SPARQLSelectValidator ;
>>        sh:message "Node is precisely one value, not {$minCount}" ;
>>        sh:sparql """
>>         SELECT $this
>>         WHERE {
>>             FILTER ( 1 < $minCount ) .
>>         }
>>         """ ;
>>      ] ;
>>
>> sh:maxCount similar
>>
>> sh:maxExclusive    add dash:hasMaxExclusive as a value for
>> sh:inversePropertyValidator
>>
>> sh:maxInclusive    add dash:hasMaxInclusive as a value for
>> sh:inversePropertyValidator
>>
>> sh:minExclusive    add dash:hasMinExclusive as a value for
>> sh:inversePropertyValidator
>>
>> sh:minInclusive    add dash:hasMinInclusive as a value for
>> sh:inversePropertyValidator
>>
>> sh:uniqueLang add
>>    sh:inversePropertyValidator [
>>        rdf:type sh:SPARQLSelectValidator ;
>>        sh:message "Language {?lang} used more than once" ;
>>        sh:sparql """
>>         SELECT DISTINCT $this ($this AS ?object) $predicate ?lang
>>         WHERE {
>>             {
>>                 FILTER ($uniqueLang) .
>>             }
>>             ?value $predicate $this .
>>             BIND (lang(?value) AS ?lang) .
>>             FILTER (bound(?lang) && ?lang != \"\") .
>>             FILTER EXISTS {
>>                 $this $predicate ?otherValue .
>>                 FILTER (?otherValue != ?value && ?lang = lang(?otherValue)) .
>>             }
>>         }
>>         """ ;
>>      ] ;
>>    sh:nodeValidator [
>>        rdf:type sh:SPARQLSelectValidator ;
>>        sh:message "A language used more than once on node" ;
>>        sh:sparql """
>>         SELECT $this
>>         WHERE { FILTER ( 1 = 0 )
>>         }
>>         """ ;
>>      ] ;
>>
>> sh:qualifiedMinCount add
>>    sh:nodeValidator [
>>        rdf:type sh:SPARQLSelectValidator ;
>>        sh:sparql """
>>         SELECT $this ($this AS ?subject) $predicate ?count ?failure
>>         WHERE {
>>             BIND (sh:hasShape(?subject, $valueShape, $shapesGraph) AS ?hasShape) .
>>             BIND (!bound(?hasShape) AS ?failure) .
>>             FILTER IF(?failure, true, ?count > IF(?hasShape,1,0))
>>         }
>> """ ;
>>      ] ;
>>
>> sh:qualifiedMaxCount similar
>>
>>
>> Note that none of these are difficult to do, particularly when looking at
>> another validator for the same component.  This should be true for any
>> constraint component that can be described as working on the value nodes.  I
>> think that all constraint components should be describable this way.
>>
>>
>> peter
>>
> 
> 

Received on Friday, 3 June 2016 14:12:33 UTC