Re: Boilerplate macro (Was: Re: ISSUE-139: single implementations of all core constraint components)

The topic of macros only concerns the SPARQL extension mechanism, which 
is part of SHACL.

Holger


On 16/06/2016 1:29, Karen Coyle wrote:
> I don't see how the topic of macros is relevant to what we are doing 
> here, which is, as I recall, developing a *language* for validation of 
> rdf graphs. SPARQL was to be used as the formalism for defining the 
> actual constraints described "to the extent possible" but at no time 
> did we decide that the group was producing the code for a SHACL 
> implementation nor that SHACL would be written in SPARQL.
>
> kc
>
> On 6/14/16 3:57 PM, Holger Knublauch wrote:
>> I am not keen on macros - people should still be able to understand
>> what's happening, and have valid SPARQL queries in front of them.
>> However, since the introduction of path constraints we are already in
>> the code injection business anyway, so we could just as well do it 
>> right.
>>
>> The boilerplate solution is basically a generalization of the templates
>> used for ASK queries. We all share the goal of reusing code. At the same
>> time, I am against limiting what people can express because the
>> framework would only support certain patterns. For example there are
>> cases where queries need to iterate over the same value set twice
>> (comparing values), or to do existential checks instead of allValues
>> iteration, see sh:hasValue.
>>
>> With all this said, here is what I believe may work well. I have not
>> tried this out yet, so I may be missing something.
>>
>> As placeholder for the path, let's use a special variable $path that can
>> only be used as predicate of triple patterns. Depending on the context,
>> the engine would replace those triple patterns with suitable code. For
>> node constraints, this path is interpreted as "the empty path starting
>> at the focus node, i.e. the focus node itself".
>>
>> The standard placeholder line would be
>>
>>     $this $path ?value .
>>
>> which for example would be replaced with
>>
>>     $this ex:parent ?value .   (property)
>>     ?value ex:parent $this .   (inv property)
>>     $this ex:parent/ex:parent ?value .   (path)
>>     BIND ($this AS ?value) .     (node constraint)
>>
>> but these variations would also be allowed:
>>
>>     $this $path ?value2 .
>>
>> which would allow binding more than one variable. And
>>
>>     $this $path $hasValue .
>>
>> to do an existential match. (For node constraints this could become a
>> FILTER sameTerm etc)
>>
>> Maybe even
>>
>>     ?value $path $this .
>>
>> to walk the inverse direction. A nice thing about this syntax is that we
>> only need to switch from $predicate to $path in our SPARQL snippets
>> (section 4).
>>
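A minimal sketch of how an engine might perform the $path substitution described above, assuming a plain textual rewrite. The function names, the context labels ("property", "inverse", "path", "node") and the regular expression are hypothetical and not part of any SHACL draft. Only the forward form with $this as subject is handled, and the FILTER sameTerm rewrite for pre-bound objects in node constraints follows the suggestion made for $hasValue.

```python
import re

# Matches the placeholder triple `$this $path <var> .` used in the thread.
# Assumption: only the forward form (with $this as subject) is rewritten.
PLACEHOLDER = re.compile(r"\$this\s+\$path\s+([?$]\w+)\s*\.")


def expand(obj, context, path):
    """Return the SPARQL text that replaces one placeholder triple."""
    if context in ("property", "path"):
        # A predicate or a property path is inserted verbatim.
        return f"$this {path} {obj} ."
    if context == "inverse":
        # Inverse property: swap subject and object.
        return f"{obj} {path} $this ."
    if context == "node":
        # Empty path: the focus node itself is the value.
        if obj.startswith("$"):
            # Pre-bound object (e.g. $hasValue): compare instead of binding.
            return f"FILTER (sameTerm($this, {obj})) ."
        return f"BIND ($this AS {obj}) ."
    raise ValueError(f"unknown context: {context}")


def rewrite(query, context, path=None):
    """Replace every `$this $path <var> .` pattern in the query text."""
    return PLACEHOLDER.sub(lambda m: expand(m.group(1), context, path), query)


if __name__ == "__main__":
    print(rewrite("$this $path ?value .", "inverse", "ex:parent"))
    # prints: ?value ex:parent $this .
```

A purely textual rewrite like this is the simplest option; an engine working on the parsed SPARQL algebra could do the same replacement more robustly.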
>> We can take this further and generalize the special status of $path to
>> apply to other path structures such as sh:equals, sh:lessThan. This
>> would allow all kinds of paths to be used together with no extra costs
>> to the implementation. Such constraint parameters could be marked, e.g.
>> with sh:shape sh:Path .
>>
>> This pattern language may cover a sufficiently large set of use cases,
>> and we could use that as a starting point until someone comes up with
>> scenarios that cannot be expressed this way. As with everything related
>> to the extension mechanism, it is very hard to anticipate all possible
>> use cases, yet here I am quite optimistic that a good middle ground is
>> found that supports code generalization without sacrificing 
>> expressivity.
>>
>> Holger
>>
>>
>> On 10/06/2016 16:32, Dimitris Kontokostas wrote:
>>> If we go this way, should we define something like a macro that
>>> replaces [boilerplate] with the actual boilerplate?
>>> It would make the queries easier to read and copy/paste mistakes
>>> harder to make.
>>> Something like %%BOILERPLATE%% or ##BOILERPLATE##:
>>> the first one makes the query invalid as a SPARQL query until we
>>> replace the boilerplate code, while the latter can be seen as a comment
>>> but risks being removed by mistake.
>>>
>>> Dimitris
>>>
>>> On Fri, Jun 10, 2016 at 4:44 AM, Peter F. Patel-Schneider
>>> <pfpschneider@gmail.com> wrote:
>>>
>>>     The boilerplate is SPARQL code that, with $this, $context, and
>>>     $predicate pre-bound and ?value not in scope, produces solutions
>>>     for ?value as the value nodes.  It can either bind ?subject and
>>>     ?object as appropriate, or these can be determined by the
>>>     governing code, which is the solution used here.  A suitable
>>>     boilerplate that does not use anything beyond pre-binding is
>>>     given here, but it would also be possible to do something more
>>>     sophisticated.
>>>
>>>     [boilerplate]
>>>
>>>     { $this $predicate ?value .
>>>       FILTER ( sameTerm($context,sh:PropertyConstraint) )
>>>     } UNION {
>>>       ?value $predicate $this .
>>>       FILTER ( sameTerm($context,sh:InversePropertyConstraint) )
>>>     } UNION {
>>>       BIND ( $this AS ?value )
>>>       FILTER ( sameTerm($context,sh:NodeConstraint) )
>>>     }
>>>
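A minimal sketch of the macro step itself, assuming the engine expands the [boilerplate] marker by plain text substitution before the query is parsed. The function name and the indentation handling are hypothetical; the boilerplate text is the one quoted above.

```python
import textwrap

# The boilerplate as given in the thread; $this, $predicate and $context
# are assumed to be pre-bound by the engine.
BOILERPLATE = """\
{ $this $predicate ?value .
  FILTER ( sameTerm($context,sh:PropertyConstraint) )
} UNION {
  ?value $predicate $this .
  FILTER ( sameTerm($context,sh:InversePropertyConstraint) )
} UNION {
  BIND ( $this AS ?value )
  FILTER ( sameTerm($context,sh:NodeConstraint) )
}"""


def expand_boilerplate(template, marker="[boilerplate]"):
    """Replace each marker in a query template with the boilerplate text."""
    out = []
    for line in template.splitlines():
        if line.strip() == marker:
            # Marker on its own line: keep the marker's indentation.
            indent = line[: len(line) - len(line.lstrip())]
            out.append(textwrap.indent(BOILERPLATE, indent))
        else:
            # Marker embedded in a line, e.g. MINUS { [boilerplate] }.
            out.append(line.replace(marker, BOILERPLATE))
    return "\n".join(out)


if __name__ == "__main__":
    template = (
        "SELECT $this ?value WHERE {\n"
        "  [boilerplate]\n"
        "  FILTER NOT EXISTS { ?value rdf:type/rdfs:subClassOf* $class }\n"
        "}"
    )
    print(expand_boilerplate(template))
```

Because the marker is not valid SPARQL, a template that has not been expanded fails to parse, which makes a forgotten expansion easy to notice.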
>>>     The code for the core constraint components is then as follows:
>>>
>>>     sh:class
>>>
>>>     SELECT $this ?value WHERE {
>>>       [boilerplate]
>>>       FILTER NOT EXISTS { ?value rdf:type/rdfs:subClassOf* $class }
>>>     }
>>>
>>>     sh:classIn
>>>
>>>     SELECT $this ?value WHERE {
>>>       [boilerplate]
>>>       FILTER NOT EXISTS {
>>>             GRAPH $shapesGraph { $classIn (rdf:rest*)/rdf:first ?class . }
>>>             ?value rdf:type/rdfs:subClassOf* ?class .
>>>       }
>>>     }
>>>
>>>     sh:datatype
>>>
>>>     SELECT $this ?value WHERE {
>>>       [boilerplate]
>>>       FILTER ( !( isLiteral(?value) && datatype(?value) = $datatype ) )
>>>     }
>>>
>>>     sh:datatypeIn
>>>
>>>     SELECT $this ?value WHERE {
>>>       [boilerplate]
>>>       FILTER NOT EXISTS {
>>>             GRAPH $shapesGraph { $datatypeIn (rdf:rest*)/rdf:first ?datatype . }
>>>             FILTER ( isLiteral(?value) && datatype(?value) = ?datatype )
>>>       }
>>>     }
>>>
>>>     sh:maxExclusive
>>>
>>>     SELECT $this ?value WHERE {
>>>       [boilerplate]
>>>       FILTER ( !(?value < $maxExclusive) )
>>>     }
>>>
>>>     sh:maxInclusive is similar
>>>     sh:minExclusive is similar
>>>     sh:minInclusive is similar
>>>
>>>     sh:in
>>>
>>>     SELECT $this ?value WHERE {
>>>       [boilerplate]
>>>       FILTER NOT EXISTS {
>>>             GRAPH $shapesGraph { $in (rdf:rest*)/rdf:first ?value . }
>>>       }
>>>     }
>>>
>>>     sh:minLength
>>>
>>>     SELECT $this ?value WHERE {
>>>       [boilerplate]
>>>       FILTER ( !( !isBlank(?value) && STRLEN(STR(?value)) >= $minLength ) )
>>>     }
>>>
>>>     sh:maxLength is similar
>>>
>>>     sh:nodeKind
>>>
>>>     SELECT $this ?value WHERE {
>>>       [boilerplate]
>>>       FILTER ( !(
>>>         (isIRI(?value) && $nodeKind IN ( sh:IRI, sh:BlankNodeOrIRI,
>>>                                          sh:IRIOrLiteral )) ||
>>>         (isLiteral(?value) && $nodeKind IN ( sh:Literal,
>>>                                              sh:BlankNodeOrLiteral,
>>>                                              sh:IRIOrLiteral )) ||
>>>         (isBlank(?value) && $nodeKind IN ( sh:BlankNode,
>>>                                            sh:BlankNodeOrIRI,
>>>                                            sh:BlankNodeOrLiteral )) ) )
>>>     }
>>>
>>>     sh:pattern
>>>
>>>     SELECT $this ?value WHERE {
>>>       [boilerplate]
>>>       FILTER ( !( !isBlank(?value) &&
>>>                   IF(bound($flags),
>>>                      regex(str(?value), $pattern, $flags),
>>>                      regex(str(?value), $pattern)) ) )
>>>     }
>>>
>>>     sh:stem
>>>
>>>     SELECT $this ?value WHERE {
>>>       [boilerplate]
>>>       FILTER ( !( isIRI(?value) && STRSTARTS(str(?value), $stem) ) )
>>>     }
>>>
>>>     sh:shape
>>>
>>>     SELECT $this ?value ?failure WHERE {
>>>       [boilerplate]
>>>       BIND (sh:hasShape(?value, $shape, $shapesGraph) AS ?hasShape) .
>>>       BIND (!bound(?hasShape) AS ?failure) .
>>>       FILTER (?failure || !?hasShape) .
>>>     }
>>>
>>>     sh:hasValue
>>>
>>>     SELECT $this ?value WHERE {
>>>       FILTER NOT EXISTS {
>>>               [boilerplate]
>>>               FILTER (sameTerm(?value,$hasValue) )
>>>       }
>>>     }
>>>
>>>     sh:maxCount
>>>
>>>     SELECT SAMPLE($this) WHERE {
>>>       [boilerplate]
>>>     } HAVING ( COUNT ( DISTINCT ?value ) > $maxCount )
>>>
>>>     sh:minCount is similar
>>>
>>>     sh:equals
>>>
>>>     SELECT $this ?value WHERE {
>>>       {
>>>         [boilerplate]
>>>         MINUS { $this $equals ?value . }
>>>       } UNION {
>>>         $this $equals ?value .
>>>         MINUS  { [boilerplate] }
>>>       }
>>>     }
>>>
>>>     sh:disjoint
>>>
>>>     SELECT $this ?value WHERE {
>>>       [boilerplate]
>>>       $this $disjoint ?value .
>>>     }
>>>
>>>     sh:lessThan
>>>
>>>     SELECT $this ?value WHERE {
>>>       [boilerplate]
>>>       $this $lessThan ?value2 .
>>>       FILTER (!(?value < ?value2))
>>>     }
>>>
>>>     sh:lessThanOrEquals is similar
>>>
>>>     sh:uniqueLang
>>>
>>>     SELECT SAMPLE($this) ?lang WHERE {
>>>       { FILTER ($uniqueLang) }
>>>       [boilerplate]
>>>       BIND (lang(?value) AS ?lang)
>>>       FILTER (isLiteral(?value) && bound(?lang) && ?lang != "")
>>>     } GROUP BY ?lang HAVING ( COUNT(?this) > 1 )
>>>
>>>     sh:qualifiedMaxCount
>>>
>>>     SELECT SAMPLE($this) ( SUM(?failed)>0 AS ?failure ) WHERE {
>>>       [boilerplate]
>>>       BIND (sh:hasShape(?value, $qualifiedValueShape, $shapesGraph) AS ?hasShape)
>>>       BIND (!bound(?hasShape) AS ?failure)
>>>       BIND ( IF(?failure,1,0) AS ?failed )
>>>       FILTER ( IF(?failure, true, ?hasShape) )
>>>     } HAVING ( ( COUNT ( DISTINCT ?value ) > $qualifiedMaxCount ) ||
>>>                ( SUM(?failed) > 0 ) )
>>>
>>>     sh:qualifiedMinCount is similar
>>>
>>>     sh:closed
>>>
>>>     SELECT $this ?value WHERE {
>>>       { FILTER ($closed) }
>>>       [boilerplate]
>>>       ?value ?predicate ?object .
>>>       FILTER (NOT EXISTS {
>>>           GRAPH $shapesGraph { $currentShape sh:property/sh:predicate ?predicate . }
>>>         } &&
>>>         ( !bound($ignoredProperties) ||
>>>           NOT EXISTS {
>>>             GRAPH $shapesGraph { $ignoredProperties rdf:rest*/rdf:first ?predicate . }
>>>           }
>>>         ) )
>>>     }
>>>
>>>     sh:not
>>>
>>>     SELECT $this ?value ?failure WHERE {
>>>       [boilerplate]
>>>       BIND (sh:hasShape(?value, $not, $shapesGraph) AS ?hasShape) .
>>>       BIND (!bound(?hasShape) AS ?failure) .
>>>       FILTER (?failure || ?hasShape) .
>>>     }
>>>
>>>     sh:and
>>>
>>>     SELECT $this ?value ?failure WHERE {
>>>       [boilerplate]
>>>       GRAPH $shapesGraph { $and (rdf:rest*)/rdf:first ?conjunct . }
>>>       BIND (sh:hasShape(?value, ?conjunct, $shapesGraph) AS ?hasShape)
>>>       BIND (!bound(?hasShape) AS ?failure)
>>>       FILTER (?failure || !?hasShape)
>>>     }
>>>
>>>     sh:or
>>>
>>>     SELECT $this ?value WHERE {
>>>       [boilerplate]
>>>       FILTER NOT EXISTS {
>>>         GRAPH $shapesGraph { $or (rdf:rest*)/rdf:first ?disjunct . }
>>>         BIND (sh:hasShape(?value, ?disjunct, $shapesGraph) AS ?hasShape)
>>>         BIND (!bound(?hasShape) AS ?failure)
>>>         FILTER ( !?failure || ?hasShape )
>>>       }
>>>     }
>>>
>>>
>>>     peter
>>>
>>>
>>>
>>>
>>> -- 
>>> Dimitris Kontokostas
>>> Department of Computer Science, University of Leipzig & DBpedia
>>> Association
>>> Projects: http://dbpedia.org, http://rdfunit.aksw.org,
>>> http://aligned-project.eu
>>> Homepage: http://aksw.org/DimitrisKontokostas
>>> Research Group: AKSW/KILT http://aksw.org/Groups/KILT
>>>
>>
>

Received on Wednesday, 15 June 2016 22:53:19 UTC