Re: eliminating the need for three SPARQL queries for constraint components from Holger Knublauch on 2016-06-03 (public-data-shapes-wg@w3.org from June 2016)

From: Holger Knublauch <holger@topquadrant.com>
Date: Fri, 3 Jun 2016 10:29:40 +1000
To: public-data-shapes-wg@w3.org
Message-ID: <d9540373-a8dd-16cf-fc0c-5c7b0f5d93bf@topquadrant.com>

I believe that if we introduce paths (see my other email today), most cases
would be covered by at most two queries, which is as good as it can get.
This is IMHO far better than trying to over-generalize and force
everything to go through the same code. While
property/inverseProperty/path queries are usually almost identical,
there are many cases where the differences between property and node
constraints are very large. For example, why first iterate over all
values of the property for sh:hasValue, when you could otherwise just
do a direct lookup? If we over-align all these cases, users would not
even be able to specify such optimizations. Needless to say, the $this
$predicate ?value loop would not even work for cases where values are
absent.
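
To make the two points above concrete, here is a minimal sketch in plain Python (not SPARQL, and not any SHACL engine's API). It models a graph as a set of triples and compares a generic "loop over all values" check with a direct lookup for a has-value test, then shows how a check driven only by the value loop stays silent when a focus node has no values at all. The graph contents and every function name are hypothetical and exist only for illustration.

```python
# Toy RDF graph: a set of (subject, predicate, object) triples.
# All data and names here are hypothetical, for illustration only.
GRAPH = {
    ("ex:alice", "ex:knows", "ex:bob"),
    ("ex:alice", "ex:knows", "ex:carol"),
}


def values(graph, focus, predicate):
    """Mimic the `$this $predicate ?value` pattern: one row per value."""
    return [o for (s, p, o) in graph if s == focus and p == predicate]


def has_value_by_iteration(graph, focus, predicate, expected):
    """Generic route: walk every value and compare it to the expected one."""
    return any(v == expected for v in values(graph, focus, predicate))


def has_value_direct(graph, focus, predicate, expected):
    """Specialised route: a single triple lookup."""
    return (focus, predicate, expected) in graph


def min_count_violations_per_value(graph, focus, predicate, min_count):
    """Simulate a check that can only report from rows of the value loop.

    When the focus node has no values the loop yields no rows, so there
    is nothing to report, even though the count (zero) is too low.
    """
    rows = values(graph, focus, predicate)
    if not rows:
        return []
    return [focus] if len(set(rows)) < min_count else []


def min_count_violations_direct(graph, focus, predicate, min_count):
    """Count the distinct values first, so zero values is still caught."""
    count = len(set(values(graph, focus, predicate)))
    return [focus] if count < min_count else []
```

In this sketch both has-value checks agree on the answer, but only the direct one avoids touching every value. For ex:bob, who has no ex:knows values, the loop-driven minimum-count check reports nothing while the direct count reports a violation.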

The general problem here and in the related ISSUE-139 discussion is that
we only seem to be informed by the Core Vocabulary. There are plenty of
other cases that do not fit into these schemes, and we are shooting
ourselves in the foot if we disallow such cases now for the sake of some
questionable "simplifications".

Holger




On 3/06/2016 9:54, Peter F. Patel-Schneider wrote:
> Instead of having potentially three different SPARQL queries for a
> constraint component, is it possible to have only one?  There are at least
> three problems to overcome.  First, SPARQL does not have all the facilities
> that one is used to having in a programming language, like compound values
> and subroutines.  Second, SHACL validation reports have a setup that is
> different for node, property, and inverse property constraints.  Third,
> SHACL messages in validation reports are sensitive to whether the constraint
> component is in a node, a property, or an inverse property constraint.
>
> One way of overcoming the first problem is just to have some boilerplate
> that has to be included in each SPARQL body.  This boilerplate is
> responsible for setting up the correct environment for the code that
> implements the actual constraint component.
>
> A piece of boilerplate that does this is
>  
>  { $this $predicate ?value .
>    FILTER ( sameTerm(?context,sh:PropertyConstraint) )
>  } UNION {
>    ?value $predicate $this .
>    FILTER ( sameTerm(?context,sh:InversePropertyConstraint) )
>  } UNION {
>    BIND ( $this AS ?value )
>    FILTER ( sameTerm(?context,sh:NodeConstraint) )
>  }
>
> Another way of overcoming this problem is to use the VALUES meaning of
> pre-binding, so that the SPARQL bodies are just started with potentially
> multiple values for $value, namely all the value nodes for a particular focus
> node.  However, this requires a particular kind of pre-binding which may not
> be available in all SPARQL implementations, so let's go with the first
> solution.  Also, I don't think that this works for all constraint
> components, in particular sh:equals.
>
>
> One way of overcoming the second problem is to have the boilerplate also set
> up the values for the validation reports, as in
>  
>  { $this $predicate ?value .
>    BIND ( $this AS ?subject )
>    BIND ( ?value AS ?object )
>    FILTER ( sameTerm(?context,sh:PropertyConstraint) )
>  } UNION {
>    ?value $predicate $this .
>    BIND ( ?value AS ?subject )
>    BIND ( $this AS ?object )
>    FILTER ( sameTerm(?context,sh:InversePropertyConstraint) )
>  } UNION {
>    BIND ( $this AS ?value )
>    FILTER ( sameTerm(?context,sh:NodeConstraint) )
>  }
>
> This generally works, but would require a change in validation reports for
> sh:closed.
>
> Another way of overcoming this problem is to change the validation reports,
> to eliminate sh:subject and sh:object and include instead sh:value for the
> value node involved and sh:context for the kind of constraint.  This appears
> to be better to me as it reduces the amount of work that has to be done in
> the SPARQL code.
>
>
> The third problem can be overcome by a more sophisticated way of generating
> the messages, such as having a macro that would expand to either "a value
> of <p> for <f>", or "a value of the inverse of <p> for <f>", or "<f>" for
> <f> a focus node and <p> a property.
>
>
> So what then would the SPARQL code for constraint components look like?
>
> Simple constraint components like sh:class are dominated by the boilerplate
>
> SELECT $this $predicate ?value ?context $class
> WHERE {
>   { $this $predicate ?value .
>     FILTER ( sameTerm($context,sh:PropertyConstraint) )
>   } UNION {
>     ?value $predicate $this .
>     FILTER ( sameTerm($context,sh:InversePropertyConstraint) )
>   } UNION {
>     BIND ( $this AS ?value )
>     FILTER ( sameTerm($context,sh:NodeConstraint) )
>   }
>   FILTER EXISTS { $value rdf:type/rdfs:subClassOf* $class }
> }
>
> This isn't as nice as the current ASK validators, but it doesn't need that
> extra capability.
>
> Some constraint components that cannot be handled by ASK validators are
> nearly as simple.  The SPARQL code for sh:minCount would be
>
> SELECT $this $predicate ?context $minCount
> WHERE {
>   { $this $predicate ?value .
>     FILTER ( sameTerm($context,sh:PropertyConstraint) )
>   } UNION {
>     ?value $predicate $this .
>     FILTER ( sameTerm($context,sh:InversePropertyConstraint) )
>   } UNION {
>     BIND ( $this AS ?value )
>     FILTER ( sameTerm($context,sh:NodeConstraint) )
>   }
> }
> HAVING ( COUNT (DISTINCT ?value) < $minCount )
>
> The SPARQL code for sh:disjoint would be
>
> SELECT $this $predicate ?value ?context
> WHERE {
>   { $this $predicate ?value .
>     FILTER ( sameTerm($context,sh:PropertyConstraint) )
>   } UNION {
>     ?value $predicate $this .
>     FILTER ( sameTerm($context,sh:InversePropertyConstraint) )
>   } UNION {
>     BIND ( $this AS ?value )
>     FILTER ( sameTerm($context,sh:NodeConstraint) )
>   }
>   $this $disjoint ?value .
> }
>
> The SPARQL code for sh:equals requires the boilerplate twice
>
> SELECT $this $predicate ?value ?context
> WHERE {
>   {
>     { $this $predicate ?value .
>       FILTER ( sameTerm($context,sh:PropertyConstraint) )
>     } UNION {
>       ?value $predicate $this .
>       FILTER ( sameTerm($context,sh:InversePropertyConstraint) )
>     } UNION {
>       BIND ( $this AS ?value )
>       FILTER ( sameTerm($context,sh:NodeConstraint) )
>     }
>     FILTER NOT EXISTS { $this $equals ?value }
>   } UNION {
>     $this $equals ?value .
>     FILTER NOT EXISTS {
>       { $this $predicate ?value .
>         FILTER ( sameTerm($context,sh:PropertyConstraint) )
>       } UNION {
>         ?value $predicate $this .
>         FILTER ( sameTerm($context,sh:InversePropertyConstraint) )
>       } UNION {
>         BIND ( $this AS ?value )
>         FILTER ( sameTerm($context,sh:NodeConstraint) )
>       }
>     }
>   }
> }
>
>
> peter
>
Received on Friday, 3 June 2016 00:30:20 UTC