Re: Boilerplate macro (Was: Re: ISSUE-139: single implementations of all core constraint components) from Karen Coyle on 2016-06-16 (public-data-shapes-wg@w3.org from June 2016)

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Thu, 16 Jun 2016 08:13:33 -0700
To: public-data-shapes-wg@w3.org
Message-ID: <16937497-b8b8-78b1-be3c-227c7a2b0c83@kcoyle.net>
I actually see that as a reason to separate the extension mechanism from 
the core (into different documents) -- because the extension mechanism 
is optional. We still have many problems in the core that need to be 
worked on.

kc

On 6/15/16 3:52 PM, Holger Knublauch wrote:
> The topic of macros only concerns the SPARQL extension mechanism, which
> is part of SHACL.
>
> Holger
>
>
> On 16/06/2016 1:29, Karen Coyle wrote:
>> I don't see how the topic of macros is relevant to what we are doing
>> here, which is, as I recall, developing a *language* for validation of
>> rdf graphs. SPARQL was to be used as the formalism for defining the
>> actual constraints described "to the extent possible" but at no time
>> did we decide that the group was producing the code for a SHACL
>> implementation nor that SHACL would be written in SPARQL.
>>
>> kc
>>
>> On 6/14/16 3:57 PM, Holger Knublauch wrote:
>>> I am not keen on macros - people should still be able to understand
>>> what's happening, and have valid SPARQL queries in front of them.
>>> However, since the introduction of path constraints we are already in
>>> the code injection business anyway, so we could just as well do it
>>> right.
>>>
>>> The boilerplate solution is basically a generalization of the templates
>>> used for ASK queries. We all share the goal of reusing code. At the same
>>> time, I am against limiting what people can express because the
>>> framework would only support certain patterns. For example there are
>>> cases where queries need to iterate over the same value set twice
>>> (comparing values), or to do existential checks instead of allValues
>>> iteration, see sh:hasValue.
>>>
>>> With all this said, here is what I believe may work well. I have not
>>> tried this out yet, so I may be missing something.
>>>
>>> As placeholder for the path, let's use a special variable $path that can
>>> only be used as predicate of triple patterns. Depending on the context,
>>> the engine would replace those triple patterns with suitable code. For
>>> node constraints, this path is interpreted as "the empty path starting
>>> at the focus node, i.e. the focus node itself".
>>>
>>> The standard placeholder line would be
>>>
>>>     $this $path ?value .
>>>
>>> which for example would be replaced with
>>>
>>>     $this ex:parent ?value .   (property)
>>>     ?value ex:parent $this .   (inv property)
>>>     $this ex:parent/ex:parent ?value .   (path)
>>>     BIND ($this AS ?value) .     (node constraint)
>>>
>>> but these variations would also be allowed:
>>>
>>>     $this $path ?value2 .
>>>
>>> which would allow binding more than one variable. And
>>>
>>>     $this $path $hasValue .
>>>
>>> to do an existential match. (For node constraints this could become a
>>> FILTER sameTerm etc)
>>>
>>> Maybe even
>>>
>>>     ?value $path $this .
>>>
>>> to walk the inverse direction. A nice thing about this syntax is that we
>>> only need to switch from $predicate to $path in our SPARQL snippets
>>> (section 4).
>>>
>>> We can take this further and generalize the special status of $path to
>>> apply to other path structures such as sh:equals, sh:lessThan. This
>>> would allow all kinds of paths to be used together with no extra costs
>>> to the implementation. Such constraint parameters could be marked, e.g.
>>> with sh:shape sh:Path .
>>>
>>> This pattern language may cover a sufficiently large set of use cases,
>>> and we could use that as a starting point until someone comes up with
>>> scenarios that cannot be expressed this way. As with everything related
>>> to the extension mechanism, it is very hard to anticipate all possible
>>> use cases, yet here I am quite optimistic that a good middle ground is
>>> found that supports code generalization without sacrificing
>>> expressivity.
>>>
>>> Holger
>>>
>>>
>>> On 10/06/2016 16:32, Dimitris Kontokostas wrote:
>>>> If we go this way, should we define something like a macro that
>>>> replaces [boilerplate] with the actual boilerplate?
>>>> It would make the queries easier to read and copy/paste mistakes
>>>> harder to make.
>>>> Something like %%BOILERPLATE%% or ##BOILERPLATE##:
>>>> the first makes the query invalid as SPARQL until we replace the
>>>> boilerplate code, while the latter can be seen as a comment
>>>> but risks being removed by mistake.
>>>>
>>>> Dimitris
>>>>
>>>> On Fri, Jun 10, 2016 at 4:44 AM, Peter F. Patel-Schneider
>>>> <pfpschneider@gmail.com> wrote:
>>>>
>>>>     The boilerplate is SPARQL code that, with $this, $context, and
>>>>     $predicate pre-bound and ?value not in scope, produces solutions
>>>>     for ?value as the value nodes.  It can either bind ?subject and
>>>>     ?object as appropriate, or these can be determined by the
>>>>     governing code, which is the solution used here.  A suitable
>>>>     boilerplate that does not use anything beyond pre-binding is
>>>>     given here, but it would also be possible to do something more
>>>>     sophisticated.
>>>>
>>>>     [boilerplate]
>>>>
>>>>     { $this $predicate ?value .
>>>>       FILTER ( sameTerm($context,sh:PropertyConstraint) )
>>>>     } UNION {
>>>>       ?value $predicate $this .
>>>>       FILTER ( sameTerm($context,sh:InversePropertyConstraint) )
>>>>     } UNION {
>>>>       BIND ( $this AS ?value )
>>>>       FILTER ( sameTerm($context,sh:NodeConstraint) )
>>>>     }
>>>>
>>>>     The code for the core constraint components is then as follows:
>>>>
>>>>     sh:class
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT EXISTS { ?value rdf:type/rdfs:subClassOf* $class }
>>>>     }
>>>>
>>>>     sh:classIn
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT EXISTS {
>>>>             GRAPH $shapesGraph { $classIn (rdf:rest*)/rdf:first ?class . }
>>>>             ?value rdf:type/rdfs:subClassOf* ?class .
>>>>       }
>>>>     }
>>>>
>>>>     sh:datatype
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER (!( isLiteral(?value) && datatype(?value) = $datatype ))
>>>>     }
>>>>
>>>>     sh:datatypeIn
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT EXISTS {
>>>>             GRAPH $shapesGraph { $datatypeIn (rdf:rest*)/rdf:first ?datatype . }
>>>>             FILTER ( isLiteral(?value) && datatype(?value) = ?datatype )
>>>>       }
>>>>     }
>>>>
>>>>     sh:maxExclusive
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER (!(?value < $maxExclusive))
>>>>     }
>>>>
>>>>     sh:maxInclusive is similar
>>>>     sh:minExclusive is similar
>>>>     sh:minInclusive is similar
>>>>
>>>>     sh:in
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT EXISTS {
>>>>             GRAPH $shapesGraph { $in (rdf:rest*)/rdf:first ?value . }
>>>>       }
>>>>     }
>>>>
>>>>     sh:minLength
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER (!( !isBlank(?value) && STRLEN(STR(?value)) >= $minLength ))
>>>>     }
>>>>
>>>>     sh:maxLength is similar
>>>>
>>>>     sh:nodeKind
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER (!(
>>>>         (isIRI(?value) && $nodeKind IN ( sh:IRI, sh:BlankNodeOrIRI, sh:IRIOrLiteral )) ||
>>>>         (isLiteral(?value) && $nodeKind IN ( sh:Literal, sh:BlankNodeOrLiteral, sh:IRIOrLiteral )) ||
>>>>         (isBlank(?value) && $nodeKind IN ( sh:BlankNode, sh:BlankNodeOrIRI, sh:BlankNodeOrLiteral ))))
>>>>     }
>>>>
>>>>     sh:pattern
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER (!( !isBlank(?value) && IF(bound($flags),
>>>>                                         regex(str(?value), $pattern, $flags),
>>>>                                         regex(str(?value), $pattern)) ))
>>>>     }
>>>>
>>>>     sh:stem
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER (!( isIRI(?value) && STRSTARTS(str(?value), $stem) ))
>>>>     }
>>>>
>>>>     sh:shape
>>>>
>>>>     SELECT $this ?value ?failure WHERE {
>>>>       [boilerplate]
>>>>       BIND (sh:hasShape(?value, $shape, $shapesGraph) AS ?hasShape) .
>>>>       BIND (!bound(?hasShape) AS ?failure) .
>>>>       FILTER (?failure || !?hasShape) .
>>>>     }
>>>>
>>>>     sh:hasValue
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       FILTER NOT EXISTS {
>>>>         [boilerplate]
>>>>         FILTER ( sameTerm(?value, $hasValue) )
>>>>       }
>>>>     }
>>>>
>>>>     sh:maxCount
>>>>
>>>>     SELECT SAMPLE($this) WHERE {
>>>>       [boilerplate]
>>>>     } HAVING ( COUNT ( DISTINCT ?value ) > $maxCount )
>>>>
>>>>     sh:minCount is similar
>>>>
>>>>     sh:equals
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       {
>>>>         [boilerplate]
>>>>         MINUS { $this $equals ?value . }
>>>>       } UNION {
>>>>         $this $equals ?value .
>>>>         MINUS  { [boilerplate] }
>>>>       }
>>>>     }
>>>>
>>>>     sh:disjoint
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       $this $disjoint ?value .
>>>>     }
>>>>
>>>>     sh:lessThan
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       $this $lessThan ?value2 .
>>>>       FILTER (!(?value < ?value2))
>>>>     }
>>>>
>>>>     sh:lessThanOrEquals is similar
>>>>
>>>>     sh:uniqueLang
>>>>
>>>>     SELECT SAMPLE($this) ?lang WHERE {
>>>>       { FILTER ($uniqueLang) }
>>>>       [boilerplate]
>>>>       BIND (lang(?value) AS ?lang)
>>>>       FILTER (isLiteral(?value) && bound(?lang) && ?lang != "")
>>>>     } GROUP BY ?lang HAVING ( COUNT(?this) > 1 )
>>>>
>>>>     sh:qualifiedMaxCount
>>>>
>>>>     SELECT SAMPLE($this) ( SUM(?failed)>0 AS ?failure ) WHERE {
>>>>       [boilerplate]
>>>>       BIND (sh:hasShape(?value, $qualifiedValueShape, $shapesGraph) AS ?hasShape)
>>>>       BIND (!bound(?hasShape) AS ?failure)
>>>>       BIND ( IF(?failure,1,0) AS ?failed )
>>>>       FILTER ( IF(?failure, true, ?hasShape) )
>>>>     } HAVING ( ( COUNT ( DISTINCT ?value ) > $qualifiedMaxCount ) ||
>>>>                ( SUM(?failed) > 0 ) )
>>>>
>>>>     sh:qualifiedMinCount is similar
>>>>
>>>>     sh:closed
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       { FILTER ($closed) }
>>>>       [boilerplate]
>>>>       ?value ?predicate ?object .
>>>>       FILTER (NOT EXISTS {
>>>>           GRAPH $shapesGraph { $currentShape sh:property/sh:predicate ?predicate . }
>>>>         } &&
>>>>         ( !bound($ignoredProperties) ||
>>>>           NOT EXISTS {
>>>>             GRAPH $shapesGraph { $ignoredProperties rdf:rest*/rdf:first ?predicate . }
>>>>           }
>>>>         ) )
>>>>     }
>>>>
>>>>     sh:not
>>>>
>>>>     SELECT $this ?value ?failure WHERE {
>>>>       [boilerplate]
>>>>       BIND (sh:hasShape(?value, $not, $shapesGraph) AS ?hasShape) .
>>>>       BIND (!bound(?hasShape) AS ?failure) .
>>>>       FILTER (?failure || ?hasShape) .
>>>>     }
>>>>
>>>>     sh:and
>>>>
>>>>     SELECT $this ?value ?failure WHERE {
>>>>       [boilerplate]
>>>>       GRAPH $shapesGraph { $and (rdf:rest*)/rdf:first ?conjunct . }
>>>>       BIND (sh:hasShape(?value, ?conjunct, $shapesGraph) AS ?hasShape)
>>>>       BIND (!bound(?hasShape) AS ?failure)
>>>>       FILTER (?failure || !?hasShape)
>>>>     }
>>>>
>>>>     sh:or
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT EXISTS {
>>>>         GRAPH $shapesGraph { $or (rdf:rest*)/rdf:first ?disjunct . }
>>>>         BIND (sh:hasShape(?value, ?disjunct, $shapesGraph) AS ?hasShape)
>>>>         BIND (!bound(?hasShape) AS ?failure)
>>>>         FILTER ( !?failure || ?hasShape )
>>>>       }
>>>>     }
>>>>
>>>>
>>>>     peter
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Dimitris Kontokostas
>>>> Department of Computer Science, University of Leipzig & DBpedia
>>>> Association
>>>> Projects: http://dbpedia.org, http://rdfunit.aksw.org,
>>>> http://aligned-project.eu
>>>> Homepage: http://aksw.org/DimitrisKontokostas
>>>> Research Group: AKSW/KILT http://aksw.org/Groups/KILT
>>>>
>>>
>>
>
>
>

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet/+1-510-984-3600
Received on Thursday, 16 June 2016 15:13:59 UTC
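
A note on the $path placeholder proposed in the quoted message from Holger: the rewriting he describes (a triple pattern with $path as predicate is replaced by context-dependent code) could be sketched roughly as below. This is only an illustration under stated assumptions, not anything the working group specified. The function name expand_path, the four context labels, and the rule that a "$"-prefixed variable counts as pre-bound while a "?"-prefixed one is free are all hypothetical choices made for the sketch.

```python
import re

# Hypothetical context labels; the thread only describes the four cases informally.
PROPERTY, INVERSE, PATH, NODE = "property", "inverse", "path", "node"

# One triple pattern whose predicate is the $path placeholder,
# e.g. "$this $path ?value ." or "?value $path $this ."
PLACEHOLDER = re.compile(r"(?P<s>[?$]\w+)\s+\$path\s+(?P<o>[?$]\w+)\s*\.")


def expand_path(query: str, context: str, path: str = "") -> str:
    """Replace every '$path' triple pattern in query with context-specific code.

    Assumption of this sketch: '$name' variables are pre-bound and
    '?name' variables are free.
    """

    def rewrite(match: re.Match) -> str:
        s, o = match.group("s"), match.group("o")
        if context == NODE:
            # Empty path: the value node is the focus node itself.
            if s.startswith("$") and o.startswith("$"):
                # Both ends pre-bound: existential check, as for sh:hasValue.
                return f"FILTER (sameTerm({s}, {o})) ."
            if not (s.startswith("$") or o.startswith("$")):
                raise ValueError("node context needs one pre-bound variable")
            bound, free = (s, o) if s.startswith("$") else (o, s)
            return f"BIND ({bound} AS {free}) ."
        if context == INVERSE:
            # Inverse property: swap subject and object.
            return f"{o} {path} {s} ."
        # Plain property or property path: substitute the path expression.
        return f"{s} {path} {o} ."

    return PLACEHOLDER.sub(rewrite, query)


if __name__ == "__main__":
    line = "$this $path ?value ."
    print(expand_path(line, PROPERTY, "ex:parent"))
    print(expand_path(line, INVERSE, "ex:parent"))
    print(expand_path(line, PATH, "ex:parent/ex:parent"))
    print(expand_path(line, NODE))
```

Doing the substitution on whole triple patterns rather than on the bare token $path is what lets the node-constraint case turn into a BIND or FILTER instead of a triple, which matches the four replacements listed in the quoted message.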