Re: Boilerplate macro (Was: Re: ISSUE-139: single implementations of all core constraint components) from Karen Coyle on 2016-06-15 (public-data-shapes-wg@w3.org from June 2016)

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Wed, 15 Jun 2016 08:29:59 -0700
To: public-data-shapes-wg@w3.org
Message-ID: <68fbc11d-54d7-6489-ca68-bc39868dea8f@kcoyle.net>
I don't see how the topic of macros is relevant to what we are doing 
here, which is, as I recall, developing a *language* for validation of 
RDF graphs. SPARQL was to be used as the formalism for defining the 
actual constraints described "to the extent possible", but at no time did 
we decide that the group was producing the code for a SHACL 
implementation, nor that SHACL would be written in SPARQL.

kc

On 6/14/16 3:57 PM, Holger Knublauch wrote:
> I am not keen on macros - people should still be able to understand
> what's happening, and have valid SPARQL queries in front of them.
> However, since the introduction of path constraints we are already in
> the code injection business anyway, so we could just as well do it right.
>
> The boilerplate solution is basically a generalization of the templates
> used for ASK queries. We all share the goal of reusing code. At the same
> time, I am against limiting what people can express because the
> framework would only support certain patterns. For example there are
> cases where queries need to iterate over the same value set twice
> (comparing values), or to do existential checks instead of allValues
> iteration, see sh:hasValue.
>
> With all this said, here is what I believe may work well. I have not
> tried this out yet, so I may be missing something.
>
> As a placeholder for the path, let's use a special variable $path that can
> only be used as predicate of triple patterns. Depending on the context,
> the engine would replace those triple patterns with suitable code. For
> node constraints, this path is interpreted as "the empty path starting
> at the focus node, i.e. the focus node itself".
>
> The standard placeholder line would be
>
>     $this $path ?value .
>
> which for example would be replaced with
>
>     $this ex:parent ?value .   (property)
>     ?value ex:parent $this .   (inv property)
>     $this ex:parent/ex:parent ?value .   (path)
>     BIND ($this AS ?value) .     (node constraint)
>
> but these variations would also be allowed:
>
>     $this $path ?value2 .
>
> which would allow binding more than one variable. And
>
>     $this $path $hasValue .
>
> to do an existential match. (For node constraints this could become a
> FILTER sameTerm etc)
>
> Maybe even
>
>     ?value $path $this .
>
> to walk the inverse direction. A nice thing about this syntax is that we
> only need to switch from $predicate to $path in our SPARQL snippets
> (section 4).
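
A minimal sketch of the $path substitution described above, in Python, assuming a purely textual rewrite. The function name expand_path, the kind labels, and the regular expression are hypothetical and not part of any SHACL draft; a real engine would more likely rewrite the parsed query than the string.

```python
import re

# Matches a triple pattern whose predicate is the placeholder $path,
# e.g. "$this $path ?value ." (subject and object must be variables).
PATH_TRIPLE = re.compile(r"(?P<s>[?$]\w+)\s+\$path\s+(?P<o>[?$]\w+)\s*\.")


def expand_path(query: str, kind: str, path: str = "") -> str:
    """Replace every '$s $path $o .' pattern according to the constraint kind.

    kind is one of "property", "path", "inverse", "node" (hypothetical labels).
    path is the predicate or path expression; it is ignored for "node".
    """

    def rewrite(match: re.Match) -> str:
        s, o = match.group("s"), match.group("o")
        if kind in ("property", "path"):
            return f"{s} {path} {o} ."
        if kind == "inverse":
            # Swap subject and object to walk the property backwards.
            return f"{o} {path} {s} ."
        if kind == "node":
            # Empty path: the value node is the focus node itself.
            if o.startswith("?"):
                return f"BIND ({s} AS {o}) ."
            if s.startswith("?"):
                return f"BIND ({o} AS {s}) ."
            # Both ends pre-bound, e.g. "$this $path $hasValue".
            return f"FILTER (sameTerm({s}, {o})) ."
        raise ValueError(f"unknown constraint kind: {kind}")

    return PATH_TRIPLE.sub(rewrite, query)


if __name__ == "__main__":
    line = "$this $path ?value ."
    for kind, path in [("property", "ex:parent"), ("inverse", "ex:parent"),
                       ("path", "ex:parent/ex:parent"), ("node", "")]:
        print(kind, "=>", expand_path(line, kind, path))
```

The sketch treats "property" and "path" identically, since a single predicate is just the shortest path expression.
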
>
> We can take this further and generalize the special status of $path to
> apply to other path structures such as sh:equals, sh:lessThan. This
> would allow all kinds of paths to be used together with no extra costs
> to the implementation. Such constraint parameters could be marked, e.g.
> with sh:shape sh:Path .
>
> This pattern language may cover a sufficiently large set of use cases,
> and we could use that as a starting point until someone comes up with
> scenarios that cannot be expressed this way. As with everything related
> to the extension mechanism, it is very hard to anticipate all possible
> use cases, yet here I am quite optimistic that a good middle ground is
> found that supports code generalization without sacrificing expressivity.
>
> Holger
>
>
> On 10/06/2016 16:32, Dimitris Kontokostas wrote:
>> If we go this way, should we define something like a macro that
>> replaces [boilerplate] with the actual boilerplate?
>> It would make the queries easier to read and copy/paste mistakes
>> harder to make.
>> Something like %%BOILERPLATE%% or ##BOILERPLATE##.
>> The first one makes the query invalid as a SPARQL query until we
>> replace the boilerplate code, while the latter can be seen as a comment
>> but risks being removed by mistake.
>>
>> Dimitris
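
A rough Python sketch of the macro expansion suggested above, assuming a token written as %%BOILERPLATE%% and plain string substitution. The names MACRO and expand_macro are hypothetical, and the one-line boilerplate in the usage part is only a stand-in for the real one.

```python
MACRO = "%%BOILERPLATE%%"  # hypothetical token; invalid SPARQL until expanded


def expand_macro(template: str, boilerplate: str, token: str = MACRO) -> str:
    """Return the template with the macro token replaced by the boilerplate.

    A token that stands alone on a line is replaced by the boilerplate lines
    at the same indentation; a token inside a line is replaced in place.
    A missing token is reported, since it may have been removed by mistake.
    """
    if token not in template:
        raise ValueError(f"macro {token} not found in query template")
    out = []
    for line in template.splitlines():
        if line.strip() == token:
            indent = line[: len(line) - len(line.lstrip())]
            out.extend(indent + part for part in boilerplate.splitlines())
        else:
            out.append(line.replace(token, boilerplate))
    return "\n".join(out)


if __name__ == "__main__":
    template = (
        "SELECT $this ?value WHERE {\n"
        "  %%BOILERPLATE%%\n"
        "  FILTER NOT EXISTS { ?value rdf:type/rdfs:subClassOf* $class }\n"
        "}"
    )
    # Stand-in boilerplate for illustration only.
    print(expand_macro(template, "$this $predicate ?value ."))
```
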
>>
>> On Fri, Jun 10, 2016 at 4:44 AM, Peter F. Patel-Schneider
>> <pfpschneider@gmail.com> wrote:
>>
>>     The boilerplate is SPARQL code that, with $this, $context, and $predicate
>>     pre-bound and ?value not in scope, produces solutions for ?value as the
>>     value nodes.  It can either bind ?subject and ?object as appropriate, or
>>     these can be determined by the governing code, which is the solution used
>>     here.  A suitable boilerplate that does not use anything beyond pre-binding
>>     is given here, but it would also be possible to do something more
>>     sophisticated.
>>
>>     [boilerplate]
>>
>>     { $this $predicate ?value .
>>       FILTER ( sameTerm($context,sh:PropertyConstraint) )
>>     } UNION {
>>       ?value $predicate $this .
>>       FILTER ( sameTerm($context,sh:InversePropertyConstraint) )
>>     } UNION {
>>       BIND ( $this AS ?value )
>>       FILTER ( sameTerm($context,sh:NodeConstraint) )
>>     }
>>
>>     The code for the core constraint components is then as follows:
>>
>>     sh:class
>>
>>     SELECT $this ?value WHERE {
>>       [boilerplate]
>>       FILTER NOT EXISTS { ?value rdf:type/rdfs:subClassOf* $class }
>>     }
>>
>>     sh:classIn
>>
>>     SELECT $this ?value WHERE {
>>       [boilerplate]
>>       FILTER NOT EXISTS {
>>             GRAPH $shapesGraph { $classIn (rdf:rest*)/rdf:first ?class . }
>>             ?value rdf:type/rdfs:subClassOf* ?class .
>>       }
>>     }
>>
>>     sh:datatype
>>
>>     SELECT $this ?value WHERE {
>>       [boilerplate]
>>       FILTER ( !( isLiteral(?value) && datatype(?value) = $datatype ) )
>>     }
>>
>>     sh:datatypeIn
>>
>>     SELECT $this ?value WHERE {
>>       [boilerplate]
>>       FILTER NOT EXISTS {
>>             GRAPH $shapesGraph { $datatypeIn (rdf:rest*)/rdf:first ?datatype . }
>>             FILTER ( isLiteral(?value) && datatype(?value) = ?datatype )
>>       }
>>     }
>>
>>     sh:maxExclusive
>>
>>     SELECT $this ?value WHERE {
>>       [boilerplate]
>>       FILTER (!(?value < $maxExclusive))
>>     }
>>
>>     sh:maxInclusive is similar
>>     sh:minExclusive is similar
>>     sh:minInclusive is similar
>>
>>     sh:in
>>
>>     SELECT $this ?value WHERE {
>>       [boilerplate]
>>       FILTER NOT EXISTS {
>>             GRAPH $shapesGraph { $in (rdf:rest*)/rdf:first ?value . }
>>       }
>>     }
>>
>>     sh:minLength
>>
>>     SELECT $this ?value WHERE {
>>       [boilerplate]
>>       FILTER ( !( !isBlank(?value) && STRLEN(STR(?value)) >= $minLength ) )
>>     }
>>
>>     sh:maxLength is similar
>>
>>     sh:nodeKind
>>
>>     SELECT $this ?value WHERE {
>>       [boilerplate]
>>       FILTER ( !(
>>         (isIRI(?value) && $nodeKind IN ( sh:IRI, sh:BlankNodeOrIRI, sh:IRIOrLiteral )) ||
>>         (isLiteral(?value) && $nodeKind IN ( sh:Literal, sh:BlankNodeOrLiteral, sh:IRIOrLiteral )) ||
>>         (isBlank(?value) && $nodeKind IN ( sh:BlankNode, sh:BlankNodeOrIRI, sh:BlankNodeOrLiteral ))
>>       ) )
>>     }
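
The acceptance condition of the sh:nodeKind filter above can be restated as a small lookup table. This is only an illustrative Python sketch: the term-kind labels and the function name violates_node_kind are hypothetical, and it assumes the blank-node case is meant to accept sh:BlankNode, sh:BlankNodeOrIRI and sh:BlankNodeOrLiteral.

```python
# For each kind of RDF term, the sh:nodeKind values that accept it.
ACCEPTED = {
    "iri": {"sh:IRI", "sh:BlankNodeOrIRI", "sh:IRIOrLiteral"},
    "literal": {"sh:Literal", "sh:BlankNodeOrLiteral", "sh:IRIOrLiteral"},
    # Assumption: the blank-node list includes sh:BlankNodeOrIRI.
    "blank": {"sh:BlankNode", "sh:BlankNodeOrIRI", "sh:BlankNodeOrLiteral"},
}


def violates_node_kind(term_kind: str, node_kind: str) -> bool:
    """True when a value of term_kind fails the given sh:nodeKind.

    term_kind is "iri", "literal" or "blank" (hypothetical labels standing in
    for isIRI, isLiteral and isBlank in the query).
    """
    return node_kind not in ACCEPTED[term_kind]


if __name__ == "__main__":
    for term_kind in ACCEPTED:
        print(term_kind, violates_node_kind(term_kind, "sh:BlankNodeOrIRI"))
```
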
>>
>>     sh:pattern
>>
>>     SELECT $this ?value WHERE {
>>       [boilerplate]
>>       FILTER ( !( !isBlank(?value) && IF(bound($flags),
>>                                          regex(str(?value), $pattern, $flags),
>>                                          regex(str(?value), $pattern)) ) )
>>     }
>>
>>     sh:stem
>>
>>     SELECT $this ?value WHERE {
>>       [boilerplate]
>>       FILTER ( !( isIRI(?value) && STRSTARTS(str(?value), $stem) ) )
>>     }
>>
>>     sh:shape
>>
>>     SELECT $this ?value ?failure WHERE {
>>       [boilerplate]
>>       BIND (sh:hasShape(?value, $shape, $shapesGraph) AS ?hasShape) .
>>       BIND (!bound(?hasShape) AS ?failure) .
>>       FILTER (?failure || !?hasShape) .
>>     }
>>
>>     sh:hasValue
>>
>>     SELECT $this ?value WHERE {
>>       FILTER NOT EXISTS {
>>         [boilerplate]
>>         FILTER ( sameTerm(?value, $hasValue) )
>>       }
>>     }
>>
>>     sh:maxCount
>>
>>     SELECT SAMPLE($this) WHERE {
>>       [boilerplate]
>>     } HAVING ( COUNT ( DISTINCT ?value ) > $maxCount )
>>
>>     sh:minCount is similar
>>
>>     sh:equals
>>
>>     SELECT $this ?value WHERE {
>>       {
>>         [boilerplate]
>>         MINUS { $this $equals ?value . }
>>       } UNION {
>>         $this $equals ?value .
>>         MINUS  { [boilerplate] }
>>       }
>>     }
>>
>>     sh:disjoint
>>
>>     SELECT $this ?value WHERE {
>>       [boilerplate]
>>       $this $disjoint ?value .
>>     }
>>
>>     sh:lessThan
>>
>>     SELECT $this ?value WHERE {
>>       [boilerplate]
>>       $this $lessThan ?value2 .
>>       FILTER (!(?value < ?value2))
>>     }
>>
>>     sh:lessThanOrEquals is similar
>>
>>     sh:uniqueLang
>>
>>     SELECT SAMPLE($this) ?lang WHERE {
>>       { FILTER ($uniqueLang) }
>>       [boilerplate]
>>       BIND (lang(?value) AS ?lang)
>>       FILTER (isLiteral(?value) && bound(?lang) && ?lang != "")
>>     } GROUP BY ?lang HAVING ( COUNT(?this) > 1 )
>>
>>     sh:qualifiedMaxCount
>>
>>     SELECT SAMPLE($this) ( SUM(?failed)>0 AS ?failure ) WHERE {
>>       [boilerplate]
>>       BIND (sh:hasShape(?value, $qualifiedValueShape, $shapesGraph) AS ?hasShape)
>>       BIND (!bound(?hasShape) AS ?failure)
>>       BIND ( IF(?failure,1,0) AS ?failed )
>>       FILTER ( IF(?failure, true, ?hasShape) )
>>     } HAVING ( ( COUNT ( DISTINCT ?value ) > $qualifiedMaxCount ) ||
>>                ( SUM(?failed) > 0 ) )
>>
>>     sh:qualifiedMinCount is similar
>>
>>     sh:closed
>>
>>     SELECT $this ?value WHERE {
>>       { FILTER ($closed) }
>>       [boilerplate]
>>       ?value ?predicate ?object .
>>       FILTER (NOT EXISTS {
>>           GRAPH $shapesGraph { $currentShape sh:property/sh:predicate ?predicate . }
>>         } &&
>>         ( !bound($ignoredProperties) ||
>>           NOT EXISTS {
>>             GRAPH $shapesGraph { $ignoredProperties rdf:rest*/rdf:first ?predicate . }
>>           }
>>         ) )
>>     }
>>
>>     sh:not
>>
>>     SELECT $this ?value ?failure WHERE {
>>       [boilerplate]
>>       BIND (sh:hasShape(?value, $not, $shapesGraph) AS ?hasShape) .
>>       BIND (!bound(?hasShape) AS ?failure) .
>>       FILTER (?failure || ?hasShape) .
>>     }
>>
>>     sh:and
>>
>>     SELECT $this ?value ?failure WHERE {
>>       [boilerplate]
>>       GRAPH $shapesGraph { $and (rdf:rest*)/rdf:first ?conjunct . }
>>       BIND (sh:hasShape(?value, ?conjunct, $shapesGraph) AS ?hasShape)
>>       BIND (!bound(?hasShape) AS ?failure)
>>       FILTER (?failure || !?hasShape)
>>     }
>>
>>     sh:or
>>
>>     SELECT $this ?value WHERE {
>>       [boilerplate]
>>       FILTER NOT EXISTS {
>>         GRAPH $shapesGraph { $or (rdf:rest*)/rdf:first ?disjunct . }
>>         BIND (sh:hasShape(?value, ?disjunct, $shapesGraph) AS ?hasShape)
>>         BIND (!bound(?hasShape) AS ?failure)
>>         FILTER ( !?failure || ?hasShape )
>>       }
>>     }
>>
>>
>>     peter
>>
>>
>>
>>
>> --
>> Dimitris Kontokostas
>> Department of Computer Science, University of Leipzig & DBpedia
>> Association
>> Projects: http://dbpedia.org, http://rdfunit.aksw.org,
>> http://aligned-project.eu
>> Homepage: http://aksw.org/DimitrisKontokostas
>> Research Group: AKSW/KILT http://aksw.org/Groups/KILT
>>
>

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet/+1-510-984-3600
Received on Wednesday, 15 June 2016 15:30:26 UTC