Re: implementation of core SHACL (using proposed syntax) from Peter F. Patel-Schneider on 2016-04-21 (public-data-shapes-wg@w3.org from April 2016)

From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
Date: Thu, 21 Apr 2016 05:07:58 -0700
To: Holger Knublauch <holger@topquadrant.com>, public-data-shapes-wg@w3.org
Message-ID: <5718C29E.1070300@gmail.com>
Here is a simple shape.

ex:pc a sh:Shape ;
  sh:scopeClass ex:pcC ;
  sh:propValues ( ex:p [ a sh:Shape ; sh:class ex:pcPC ] ) .

Here is the ugly query it turns into:

PREFIX sh: <http://www.w3.org/ns/shacl#>
# SHAPE start <http://ex.com/pc>
  SELECT ?parent ?this ?subject ?predicate ?object
 ?thisShape ?childShape ?severity ?component ?message
  WHERE # SHAPE body
 { { # newContext
  SELECT DISTINCT ?parent ?this ?subject ?predicate ?object ?thisShape
?childShape ?severity ?component ?message
  WHERE { { # SHAPE start "ub1bL311C24"
  SELECT ?parent ?this ?subject ?predicate ?object
 ?thisShape ?childShape ?severity ?component ?message
  WHERE # SHAPE body
 { {   SELECT ?parent ?this (?this AS ?subject) ("ub1bL311C24" AS ?thisShape)
# FRAGMENT
         (<http://www.w3.org/ns/shacl#Violation> AS ?severity)
         (<http://www.w3.org/ns/shacl#class> AS ?component)
         ("Does not have required class ex:pcPC" AS ?message)
  WHERE { { SELECT (?p AS ?parent) (?gp AS ?grandparent)
 WHERE { { SELECT (?this AS ?p) (?parent AS ?gp) WHERE { ?this
rdf:type/rdfs:subClassOf* <http://ex.com/pcC> . } }
 } } { ?parent <http://ex.com/p> ?this . }
          FILTER ( ! EXISTS { ?this rdf:type/rdfs:subClassOf*
<http://ex.com/pcPC> } ) } }
        } # SHAPE end "ub1bL311C24"
 } UNION
   { SELECT ?parent ?this (?this AS ?subject) (ex:p AS ?predicate) ?object
  (<http://ex.com/pc> as ?thisShape) ("ub1bL311C24" as ?childShape)
(<http://www.w3.org/ns/shacl#Violation> AS ?severity)
(<http://www.w3.org/ns/shacl#propValues> AS ?component) ("In path ex:p " AS
?message)
     WHERE {
         { SELECT (?o AS ?object) (?p AS ?this) WHERE
    { SELECT (?parent AS ?p) (?this AS ?o ) WHERE {
   { # SHAPE start "ub1bL311C24"
  SELECT ?parent ?this ?subject ?predicate ?object
 ?thisShape ?childShape ?severity ?component ?message
  WHERE # SHAPE body
 { {   SELECT ?parent ?this (?this AS ?subject) ("ub1bL311C24" AS ?thisShape)
# FRAGMENT
         (<http://www.w3.org/ns/shacl#Violation> AS ?severity)
         (<http://www.w3.org/ns/shacl#class> AS ?component)
         ("Does not have required class ex:pcPC" AS ?message)
  WHERE { { SELECT (?p AS ?parent) (?gp AS ?grandparent)
 WHERE { { SELECT (?this AS ?p) (?parent AS ?gp) WHERE { ?this
rdf:type/rdfs:subClassOf* <http://ex.com/pcC> . } }
 } } { ?parent <http://ex.com/p> ?this . }
          FILTER ( ! EXISTS { ?this rdf:type/rdfs:subClassOf*
<http://ex.com/pcPC> } ) } }
        } # SHAPE end "ub1bL311C24"
     } FILTER ( sameTerm(?thisShape,"ub1bL311C24") )
  } } }
   ?this rdf:type/rdfs:subClassOf* <http://ex.com/pcC> .
     }
   }
 } }
        } # SHAPE end <http://ex.com/pc>

I have worked with the SPARQL implementation in RDFLib.  This implementation
is quirky and buggy in a number of ways, so the above query may not be valid
SPARQL.

There are other ways of proceeding.  One way, which might be more efficient
for core SHACL, would be to emit a number of queries for each shape and not
use subqueries at all.

Here is a set of results that are returned:

PARENT ex:pc2 THIS ex:pcp2 SE sh:Violation C sh:class SH "ub1bL311C24" CSH
  S ex:pcp2 P O MESSAGE "Does not have required class ex:pcPC"
PARENT ex:pc3 THIS ex:pcp4 SE sh:Violation C sh:class SH "ub1bL311C24" CSH
  S ex:pcp4 P O MESSAGE "Does not have required class ex:pcPC"
PARENT ex:pc3 THIS ex:pcp2 SE sh:Violation C sh:class SH "ub1bL311C24" CSH
  S ex:pcp2 P O MESSAGE "Does not have required class ex:pcPC"
PARENT THIS ex:pc2 SE sh:Violation C sh:propValues SH ex:pc CSH "ub1bL311C24"
  S ex:pc2 P ex:p O ex:pcp2 MESSAGE "In path ex:p "
PARENT THIS ex:pc3 SE sh:Violation C sh:propValues SH ex:pc CSH "ub1bL311C24"
  S ex:pc3 P ex:p O ex:pcp4 MESSAGE "In path ex:p "
PARENT THIS ex:pc3 SE sh:Violation C sh:propValues SH ex:pc CSH "ub1bL311C24"
  S ex:pc3 P ex:p O ex:pcp2 MESSAGE "In path ex:p "

The RDF-style violations are constructed by taking each top-level solution
(one with unbound parent) and linking it to every other solution whose parent
is that solution's this (PARENT and THIS in the printed results) and whose
thisShape is that solution's childShape (SH and CSH).
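
The linking step described above can be sketched in Python roughly as
follows.  This is only an illustration, not the code in my implementation:
the dict representation of a solution, the helper names sol and
link_details, and the use of None for an unbound variable are all
assumptions made for the sketch.  The data is the six results printed above.

```python
# Assumed representation: one dict per SPARQL solution; None means unbound.
SHAPE_ID = "ub1bL311C24"  # identifier of the blank-node shape, as printed above


def sol(parent, this, this_shape, child_shape, obj=None):
    """Hypothetical helper: build one solution with only the relevant variables."""
    return {
        "parent": parent,
        "this": this,
        "thisShape": this_shape,
        "childShape": child_shape,
        "object": obj,
    }


# The six results from the printed output, in the same order.
SOLUTIONS = [
    sol("ex:pc2", "ex:pcp2", SHAPE_ID, None),
    sol("ex:pc3", "ex:pcp4", SHAPE_ID, None),
    sol("ex:pc3", "ex:pcp2", SHAPE_ID, None),
    sol(None, "ex:pc2", "ex:pc", SHAPE_ID, "ex:pcp2"),
    sol(None, "ex:pc3", "ex:pc", SHAPE_ID, "ex:pcp4"),
    sol(None, "ex:pc3", "ex:pc", SHAPE_ID, "ex:pcp2"),
]


def link_details(solutions):
    """Return (top_index, detail_index) pairs, one per detail link."""
    links = []
    for i, top in enumerate(solutions):
        # Only top-level solutions (unbound parent) with a child shape qualify.
        if top["parent"] is not None or top["childShape"] is None:
            continue
        for j, other in enumerate(solutions):
            if (
                j != i
                and other["parent"] == top["this"]
                and other["thisShape"] == top["childShape"]
            ):
                links.append((i, j))
    return links


print(link_details(SOLUTIONS))
```

With the rule exactly as stated, both top-level ex:pc3 solutions link to both
nested ex:pc3 results, because the match compares only this/parent and the
shape identifiers and does not look at the object binding.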


There are two problems making this work for recursive shapes.  First, my
generation method would go into an infinite loop.  That could be fixed, but
then the triple "chase" would have to be returned so that recursive
violations can be propagated back through the property links.  This can be
done as well, but it would end up looking more than a bit like just grabbing
the entire RDF graph and processing it outside of SPARQL.
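
As an illustration of the first problem only, here is a minimal Python sketch
of one conventional way to stop a recursive expansion from looping: track the
shapes currently being expanded and emit a reference instead of expanding one
of them again.  This is not how my generator works; the expand function, the
shape names ex:person and ex:friend, and the dict-of-lists encoding of shape
references are all hypothetical.

```python
def expand(shape, shapes, in_progress=()):
    """Expand a shape into a nested structure, cutting off recursive references.

    shapes maps a shape name to the names of the shapes it refers to
    (an assumed, simplified stand-in for real shape definitions).
    """
    if shape in in_progress:
        # Already being expanded further up: emit a reference, do not recurse.
        return {"ref": shape}
    children = [
        expand(child, shapes, in_progress + (shape,))
        for child in shapes.get(shape, [])
    ]
    return {"shape": shape, "children": children}


# Two hypothetical shapes that refer to each other.
SHAPES = {"ex:person": ["ex:friend"], "ex:friend": ["ex:person"]}

print(expand("ex:person", SHAPES))
```

The cut-off only terminates generation.  The reference left behind is never
checked inside the query, which is why the triple "chase" would still have to
be returned and processed afterwards.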




On 04/21/2016 02:29 AM, Holger Knublauch wrote:
> Hi Peter,
> 
> I still don't understand how this works. Could you print some SPARQL that is
> being generated?
> 
> What I don't understand is that if you turn sh:valueShape into nested
> validation calls, then how do the inner result sets ever get turned into
> validation result resources? If you have somehow solved this, then why does
> the same approach not work for recursion?
> 
> (This may be obvious to you as the implementer, but is not clear to me)
> 
> Thanks
> Holger
> 
> 
> On 18/04/2016 13:37, Peter F. Patel-Schneider wrote:
>> Here is an extract from my implementation notes that describes how result sets
>> can be augmented and combined to produce detailed validation results.
>>
>>
>> The result sets contain bindings for the following variables:
>>    Variable   SHACL predicate    Description
>>    parent                        the node that was traversed to get to the
>>                                  focus node, if any
>>    this       sh:focusNode       focus node
>>    PS         sh:sourceShape     identifier for shape that produced the result
>>    CS                            identifier for embedded shape, if any
>>    subject    sh:subject         usually focus node of violation
>>    predicate  sh:predicate       property or path involved in violation
>>    object     sh:object
>>    severity   sh:severity        severity of result
>>    shape                         identifier for top-level shape (probably not
>>                                  needed)
>>    component  sh:sourceTemplate  component property that produced the result
>>    message    sh:message         a human readable message
>>
>> Identifiers for shapes are the node itself if the shape is an IRI and a
>> unique identifier for shapes that are blank nodes.  Note that for inverse
>> properties, the subject variable is the object of the triple that produced
>> the violation and the object variable is the subject of the triple.
>>
>> A node validates against a shape if there is no solution in the result set
>> with the variable this bound to the node or unbound, the variable PS
>> bound to the identifier for the shape, and the variable severity bound to
>> sh:Violation.
>>
>> The SHACL validation results graph can be constructed by first creating a
>> blank node for each solution in the result set and adding triples with it as
>> subject as above.  For solutions that have the CS variable bound a sh:detail
>> triple is added for each other solution that has its PS variable bound to
>> the CS binding in this solution and the focus node of this solution as its
>> binding for the parent variable.  This triple has the node for this solution
>> as its subject and the node for the other solution as its object.
>>
>>
>>
>> On 04/17/2016 07:07 PM, Holger Knublauch wrote:
>>> On 14/04/2016 5:29, Peter F. Patel-Schneider wrote:
>>>> On 04/13/2016 02:49 AM, Peter F. Patel-Schneider wrote:
>>>>> On 04/12/2016 10:40 PM, Holger Knublauch wrote:
>>>>>> On 13/04/2016 1:11, Peter F. Patel-Schneider wrote:
>>>>>> [...]
>>>>>> This also confirms two limitations of this single-query-transformation
>>>>>> approach (we had discussed this before):
>>>>>> - inability to generate nested validation results
>>>>>> - inability to handle recursion
>>>>>> The current design uses sh:hasShape which doesn't have these limitations.
>>>>>>
>>>>>> Holger
>>>> I just made a minor modification to the way that validation results are
>>>> combined to allow for nested validation results.
>>>>
>>>> peter
>>>>
>>> Hi Peter,
>>>
>>> you have made me curious here. Would you mind providing some details or an
>>> example? I assume we are talking about a case such as a sh:valueShape that
>>> fails, and its result object should point to other validation results for each
>>> value that does not match the shape, via sh:detail. So I was expecting to find
>>> some reference to sh:detail in your implementation.
>>>
>>> Thanks
>>> Holger
>>>
>>> PS: Apologies in advance for not responding to the other open email threads
>>> yet - I am trying to focus on updating my implementation and the advanced
>>> sections of the spec to the metamodel 3 draft. I will get back to the other
>>> emails once this block of work is completed.
>>>
>>>
> 
>
Received on Thursday, 21 April 2016 12:08:29 UTC