- From: Karen Coyle <kcoyle@kcoyle.net>
- Date: Thu, 16 Jun 2016 08:13:33 -0700
- To: public-data-shapes-wg@w3.org
I actually see that as a reason to separate the extension mechanism from
the core (into different documents) -- because the extension mechanism
is optional. We still have many problems in the core that need to be
worked on.
kc
On 6/15/16 3:52 PM, Holger Knublauch wrote:
> The topic of macros only concerns the SPARQL extension mechanism, which
> is part of SHACL.
>
> Holger
>
>
> On 16/06/2016 1:29, Karen Coyle wrote:
>> I don't see how the topic of macros is relevant to what we are doing
>> here, which is, as I recall, developing a *language* for validation of
>> rdf graphs. SPARQL was to be used as the formalism for defining the
>> actual constraints described "to the extent possible" but at no time
>> did we decide that the group was producing the code for a SHACL
>> implementation nor that SHACL would be written in SPARQL.
>>
>> kc
>>
>> On 6/14/16 3:57 PM, Holger Knublauch wrote:
>>> I am not keen on macros - people should still be able to understand
>>> what's happening, and have valid SPARQL queries in front of them.
>>> However, since the introduction of path constraints we are already in
>>> the code injection business anyway, so we could just as well do it
>>> right.
>>>
>>> The boilerplate solution is basically a generalization of the templates
>>> used for ASK queries. We all share the goal of reusing code. At the same
>>> time, I am against limiting what people can express because the
>>> framework would only support certain patterns. For example there are
>>> cases where queries need to iterate over the same value set twice
>>> (comparing values), or to do existential checks instead of allValues
>>> iteration, see sh:hasValue.
>>>
>>> With all this said, here is what I believe may work well. I have not
>>> tried this out yet, so I may be missing something.
>>>
>>> As placeholder for the path, let's use a special variable $path that can
>>> only be used as predicate of triple patterns. Depending on the context,
>>> the engine would replace those triple patterns with suitable code. For
>>> node constraints, this path is interpreted as "the empty path starting
>>> at the focus node, i.e. the focus node itself".
>>>
>>> The standard placeholder line would be
>>>
>>> $this $path ?value .
>>>
>>> which for example would be replaced with
>>>
>>> $this ex:parent ?value . (property)
>>> ?value ex:parent $this . (inv property)
>>> $this ex:parent/ex:parent ?value . (path)
>>> BIND ($this AS ?value) . (node constraint)
>>>
>>> but these variations would also be allowed:
>>>
>>> $this $path ?value2 .
>>>
>>> which would allow binding more than one variable. And
>>>
>>> $this $path $hasValue .
>>>
>>> to do an existential match. (For node constraints this could become a
>>> FILTER sameTerm etc)
>>>
>>> Maybe even
>>>
>>> ?value $path $this .
>>>
>>> to walk the inverse direction. A nice thing about this syntax is that we
>>> only need to switch from $predicate to $path in our SPARQL snippets
>>> (section 4).
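
A minimal sketch of how an engine might perform the $path substitution described above. This is an illustration only: the function name expand_path, the context labels, and the regular expression are assumptions introduced here, not part of the proposal, and the sketch only handles patterns whose subject and object are both variables.

```python
import re

# Hypothetical context labels; the thread names these cases only informally.
PROPERTY, INVERSE, PATH, NODE = "property", "inverse", "path", "node"

# Matches a triple pattern whose predicate is the placeholder $path,
# e.g. "$this $path ?value ." (subject and object must be variables).
_PATTERN = re.compile(r"([?$]\w+)\s+\$path\s+([?$]\w+)\s*\.")


def expand_path(query: str, context: str, path: str = "") -> str:
    """Replace every '<subj> $path <obj> .' pattern according to the context."""

    def rewrite(match: re.Match) -> str:
        subj, obj = match.group(1), match.group(2)
        if context in (PROPERTY, PATH):
            return f"{subj} {path} {obj} ."
        if context == INVERSE:
            # Inverse property: swap subject and object.
            return f"{obj} {path} {subj} ."
        if context == NODE:
            # Empty path: the other end is the focus node itself.
            bound, free = (subj, obj) if obj.startswith("?") else (obj, subj)
            if free.startswith("?"):
                return f"BIND ({bound} AS {free}) ."
            # Both ends are pre-bound ($-variables): existential check.
            return f"FILTER (sameTerm({subj}, {obj})) ."
        raise ValueError(f"unknown context: {context}")

    return _PATTERN.sub(rewrite, query)


if __name__ == "__main__":
    template = "SELECT $this ?value WHERE { $this $path ?value . }"
    print(expand_path(template, PROPERTY, "ex:parent"))
    print(expand_path(template, PATH, "ex:parent/ex:parent"))
    print(expand_path(template, NODE))
```

A real implementation would more likely rewrite the parsed query algebra than the query text, since textual matching can be confused by string literals and comments.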
>>>
>>> We can take this further and generalize the special status of $path to
>>> apply to other path structures such as sh:equals, sh:lessThan. This
>>> would allow all kinds of paths to be used together with no extra costs
>>> to the implementation. Such constraint parameters could be marked, e.g.
>>> with sh:shape sh:Path .
>>>
>>> This pattern language may cover a sufficiently large set of use cases,
>>> and we could use that as a starting point until someone comes up with
>>> scenarios that cannot be expressed this way. As with everything related
>>> to the extension mechanism, it is very hard to anticipate all possible
>>> use cases, yet here I am quite optimistic that a good middle ground is
>>> found that supports code generalization without sacrificing
>>> expressivity.
>>>
>>> Holger
>>>
>>>
>>> On 10/06/2016 16:32, Dimitris Kontokostas wrote:
>>>> If we go this way, should we define something like a macro that
>>>> replaces [boilerplate] with the actual boilerplate?
>>>> It would make the queries easier to read and copy/paste mistakes
>>>> harder to make.
>>>> Something like %%BOILERPLATE%% or ##BOILERPLATE##:
>>>> the first makes the query invalid as a SPARQL query until we
>>>> replace the boilerplate code, while the latter can be seen as a comment
>>>> but has the risk of being removed by mistake.
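
A rough sketch of the macro idea, assuming the %%BOILERPLATE%% spelling and a purely textual replacement. The names TOKEN, BOILERPLATE, and expand are hypothetical; the boilerplate text is copied from Peter's message quoted further down. Refusing templates without the token is one way to catch the case where a marker has been removed by mistake.

```python
# Boilerplate from the quoted proposal; $this, $context and $predicate are
# assumed to be pre-bound by the engine before the query runs.
BOILERPLATE = """\
{ $this $predicate ?value .
  FILTER ( sameTerm($context, sh:PropertyConstraint) )
} UNION {
  ?value $predicate $this .
  FILTER ( sameTerm($context, sh:InversePropertyConstraint) )
} UNION {
  BIND ( $this AS ?value )
  FILTER ( sameTerm($context, sh:NodeConstraint) )
}"""

# Hypothetical macro token; not valid SPARQL until it has been replaced.
TOKEN = "%%BOILERPLATE%%"


def expand(template: str) -> str:
    """Return the template with every macro token replaced by the boilerplate."""
    if TOKEN not in template:
        # A missing token probably means it was deleted or mistyped.
        raise ValueError("template does not contain " + TOKEN)
    return template.replace(TOKEN, BOILERPLATE)


if __name__ == "__main__":
    template = (
        "SELECT $this ?value WHERE {\n"
        "  %%BOILERPLATE%%\n"
        "  FILTER ( !(isIRI(?value) && STRSTARTS(str(?value), $stem)) )\n"
        "}"
    )
    print(expand(template))
```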
>>>>
>>>> Dimitris
>>>>
>>>> On Fri, Jun 10, 2016 at 4:44 AM, Peter F. Patel-Schneider
>>>> <pfpschneider@gmail.com> wrote:
>>>>
>>>> The boilerplate is SPARQL code that, with $this, $context, and
>>>> $predicate pre-bound and ?value not in scope, produces solutions for
>>>> ?value as the value nodes. It can either bind ?subject and ?object as
>>>> appropriate, or these can be determined by the governing code, which
>>>> is the solution used here. A suitable boilerplate that does not use
>>>> anything beyond pre-binding is given here, but it would also be
>>>> possible to do something more sophisticated.
>>>>
>>>> [boilerplate]
>>>>
>>>> { $this $predicate ?value .
>>>> FILTER ( sameTerm($context,sh:PropertyConstraint) )
>>>> } UNION {
>>>> ?value $predicate $this .
>>>> FILTER ( sameTerm($context,sh:InversePropertyConstraint) )
>>>> } UNION {
>>>> BIND ( $this AS ?value )
>>>> FILTER ( sameTerm($context,sh:NodeConstraint) )
>>>> }
>>>>
>>>> The code for the core constraint components is then as follows:
>>>>
>>>> sh:class
>>>>
>>>> SELECT $this ?value WHERE {
>>>> [boilerplate]
>>>> FILTER NOT EXISTS { ?value rdf:type/rdfs:subClassOf* $class }
>>>> }
>>>>
>>>> sh:classIn
>>>>
>>>> SELECT $this ?value WHERE {
>>>> [boilerplate]
>>>> FILTER NOT EXISTS {
>>>> GRAPH $shapesGraph { $classIn (rdf:rest*)/rdf:first
>>>> ?class . }
>>>> ?value rdf:type/rdfs:subClassOf* ?class .
>>>> }
>>>> }
>>>>
>>>> sh:datatype
>>>>
>>>> SELECT $this ?value WHERE {
>>>> [boilerplate]
>>>> FILTER ( !( isLiteral(?value) && datatype(?value) = $datatype ) )
>>>> }
>>>>
>>>> sh:datatypeIn
>>>>
>>>> SELECT $this ?value WHERE {
>>>> [boilerplate]
>>>> FILTER NOT EXISTS {
>>>> GRAPH $shapesGraph { $datatypeIn (rdf:rest*)/rdf:first ?datatype . }
>>>> FILTER ( isLiteral(?value) && datatype(?value) = ?datatype )
>>>> }
>>>> }
>>>>
>>>> sh:maxExclusive
>>>>
>>>> SELECT $this ?value WHERE {
>>>> [boilerplate]
>>>> FILTER (!(?value < $maxExclusive))
>>>> }
>>>>
>>>> sh:maxInclusive is similar
>>>> sh:minExclusive is similar
>>>> sh:minInclusive is similar
>>>>
>>>> sh:in
>>>>
>>>> SELECT $this ?value WHERE {
>>>> [boilerplate]
>>>> FILTER NOT EXISTS {
>>>> GRAPH $shapesGraph { $in (rdf:rest*)/rdf:first ?value . }
>>>> }
>>>> }
>>>>
>>>> sh:minLength
>>>>
>>>> SELECT $this ?value WHERE {
>>>> [boilerplate]
>>>> FILTER ( !( !isBlank(?value) && STRLEN(STR(?value)) >= $minLength ) )
>>>> }
>>>>
>>>> sh:maxLength is similar
>>>>
>>>> sh:nodeKind
>>>>
>>>> SELECT $this ?value WHERE {
>>>> [boilerplate]
>>>> FILTER ( !(
>>>> (isIRI(?value) && $nodeKind IN ( sh:IRI, sh:BlankNodeOrIRI,
>>>> sh:IRIOrLiteral )) ||
>>>> (isLiteral(?value) && $nodeKind IN ( sh:Literal,
>>>> sh:BlankNodeOrLiteral,
>>>> sh:IRIOrLiteral )) ||
>>>> (isBlank(?value) && $nodeKind IN ( sh:BlankNode,
>>>> sh:BlankNodeOrIRI,
>>>> sh:BlankNodeOrLiteral )) ) )
>>>> }
>>>>
>>>> sh:pattern
>>>>
>>>> SELECT $this ?value WHERE {
>>>> [boilerplate]
>>>> FILTER ( !( !isBlank(?value) && IF(bound($flags),
>>>> regex(str(?value), $pattern, $flags),
>>>> regex(str(?value), $pattern)) ) )
>>>> }
>>>>
>>>> sh:stem
>>>>
>>>> SELECT $this ?value WHERE {
>>>> [boilerplate]
>>>> FILTER ( !( isIRI(?value) && STRSTARTS(str(?value), $stem) ) )
>>>> }
>>>>
>>>> sh:shape
>>>>
>>>> SELECT $this ?value ?failure WHERE {
>>>> [boilerplate]
>>>> BIND (sh:hasShape(?value, $shape, $shapesGraph) AS ?hasShape) .
>>>> BIND (!bound(?hasShape) AS ?failure) .
>>>> FILTER (?failure || !?hasShape) .
>>>> }
>>>>
>>>> sh:hasValue
>>>>
>>>> SELECT $this ?value WHERE {
>>>> FILTER NOT EXISTS {
>>>> [boilerplate]
>>>> FILTER (sameTerm(?value,$hasValue) )
>>>> }
>>>> }
>>>>
>>>> sh:maxCount
>>>>
>>>> SELECT SAMPLE($this) WHERE {
>>>> [boilerplate]
>>>> } HAVING ( COUNT ( DISTINCT ?value ) > $maxCount )
>>>>
>>>> sh:minCount is similar
>>>>
>>>> sh:equals
>>>>
>>>> SELECT $this ?value WHERE {
>>>> {
>>>> [boilerplate]
>>>> MINUS { $this $equals ?value . }
>>>> } UNION {
>>>> $this $equals ?value .
>>>> MINUS { [boilerplate] }
>>>> }
>>>> }
>>>>
>>>> sh:disjoint
>>>>
>>>> SELECT $this ?value WHERE {
>>>> [boilerplate]
>>>> $this $disjoint ?value .
>>>> }
>>>>
>>>> sh:lessThan
>>>>
>>>> SELECT $this ?value WHERE {
>>>> [boilerplate]
>>>> $this $lessThan ?value2 .
>>>> FILTER (!(?value < ?value2))
>>>> }
>>>>
>>>> sh:lessThanOrEquals is similar
>>>>
>>>> sh:uniqueLang
>>>>
>>>> SELECT SAMPLE($this) ?lang WHERE {
>>>> { FILTER ($uniqueLang) }
>>>> [boilerplate]
>>>> BIND (lang(?value) AS ?lang)
>>>> FILTER (isLiteral(?value) && bound(?lang) && ?lang != "")
>>>> } GROUP BY ?lang HAVING ( COUNT(?this) > 1 )
>>>>
>>>> sh:qualifiedMaxCount
>>>>
>>>> SELECT SAMPLE($this) ( SUM(?failed)>0 AS ?failure ) WHERE {
>>>> [boilerplate]
>>>> BIND (sh:hasShape(?value, $qualifiedValueShape, $shapesGraph)
>>>> AS ?hasShape)
>>>> BIND (!bound(?hasShape) AS ?failure)
>>>> BIND ( IF(?failure,1,0) AS ?failed )
>>>> FILTER ( IF(?failure, true, ?hasShape) )
>>>> } HAVING ( ( COUNT ( DISTINCT ?value ) > $qualifiedMaxCount ) ||
>>>> ( SUM(?failed) > 0 ) )
>>>>
>>>> sh:qualifiedMinCount is similar
>>>>
>>>> sh:closed
>>>>
>>>> SELECT $this ?value WHERE {
>>>> { FILTER ($closed) }
>>>> [boilerplate]
>>>> ?value ?predicate ?object .
>>>> FILTER (NOT EXISTS {
>>>> GRAPH $shapesGraph { $currentShape sh:property/sh:predicate
>>>> ?predicate . }
>>>> } &&
>>>> ( !bound($ignoredProperties) ||
>>>> NOT EXISTS {
>>>> GRAPH $shapesGraph { $ignoredProperties
>>>> rdf:rest*/rdf:first ?predicate . }
>>>> }
>>>> ) )
>>>> }
>>>>
>>>> sh:not
>>>>
>>>> SELECT $this ?value ?failure WHERE {
>>>> [boilerplate]
>>>> BIND (sh:hasShape(?value, $not, $shapesGraph) AS ?hasShape) .
>>>> BIND (!bound(?hasShape) AS ?failure) .
>>>> FILTER (?failure || ?hasShape) .
>>>> }
>>>>
>>>> sh:and
>>>>
>>>> SELECT $this ?value ?failure WHERE {
>>>> [boilerplate]
>>>> GRAPH $shapesGraph { $and (rdf:rest*)/rdf:first ?conjunct . }
>>>> BIND (sh:hasShape(?value, ?conjunct, $shapesGraph) AS ?hasShape)
>>>> BIND (!bound(?hasShape) AS ?failure)
>>>> FILTER (?failure || !?hasShape)
>>>> }
>>>>
>>>> sh:or
>>>>
>>>> SELECT $this ?value WHERE {
>>>> [boilerplate]
>>>> FILTER NOT EXISTS {
>>>> GRAPH $shapesGraph { $or (rdf:rest*)/rdf:first ?disjunct . }
>>>> BIND (sh:hasShape(?value, ?disjunct, $shapesGraph) AS
>>>> ?hasShape)
>>>> BIND (!bound(?hasShape) AS ?failure)
>>>> FILTER ( !?failure && ?hasShape )
>>>> }
>>>> }
>>>>
>>>>
>>>> peter
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Dimitris Kontokostas
>>>> Department of Computer Science, University of Leipzig & DBpedia
>>>> Association
>>>> Projects: http://dbpedia.org, http://rdfunit.aksw.org,
>>>> http://aligned-project.eu
>>>> Homepage: http://aksw.org/DimitrisKontokostas
>>>> Research Group: AKSW/KILT http://aksw.org/Groups/KILT
>>>>
>>>
>>
>
>
>
--
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet/+1-510-984-3600
Received on Thursday, 16 June 2016 15:13:59 UTC