- From: Karen Coyle <kcoyle@kcoyle.net>
- Date: Thu, 16 Jun 2016 08:13:33 -0700
- To: public-data-shapes-wg@w3.org
I actually see that as a reason to separate the extension mechanism from
the core (into different documents) -- because the extension mechanism is
optional. We still have many problems in the core that need to be worked on.

kc

On 6/15/16 3:52 PM, Holger Knublauch wrote:
> The topic of macros only concerns the SPARQL extension mechanism, which
> is part of SHACL.
>
> Holger
>
>
> On 16/06/2016 1:29, Karen Coyle wrote:
>> I don't see how the topic of macros is relevant to what we are doing
>> here, which is, as I recall, developing a *language* for validation of
>> rdf graphs. SPARQL was to be used as the formalism for defining the
>> actual constraints described "to the extent possible" but at no time
>> did we decide that the group was producing the code for a SHACL
>> implementation nor that SHACL would be written in SPARQL.
>>
>> kc
>>
>> On 6/14/16 3:57 PM, Holger Knublauch wrote:
>>> I am not keen on macros - people should still be able to understand
>>> what's happening, and have valid SPARQL queries in front of them.
>>> However, since the introduction of path constraints we are already in
>>> the code injection business anyway, so we could just as well do it
>>> right.
>>>
>>> The boilerplate solution is basically a generalization of the templates
>>> used for ASK queries. We all share the goal of reusing code. At the same
>>> time, I am against limiting what people can express because the
>>> framework would only support certain patterns. For example there are
>>> cases where queries need to iterate over the same value set twice
>>> (comparing values), or to do existential checks instead of allValues
>>> iteration, see sh:hasValue.
>>>
>>> With all this said, here is what I believe may work well. I have not
>>> tried this out yet, so I may be missing something.
>>>
>>> As placeholder for the path, let's use a special variable $path that can
>>> only be used as predicate of triple patterns. Depending on the context,
>>> the engine would replace those triple patterns with suitable code. For
>>> node constraints, this path is interpreted as "the empty path starting
>>> at the focus node, i.e. the focus node itself".
>>>
>>> The standard placeholder line would be
>>>
>>>     $this $path ?value .
>>>
>>> which for example would be replaced with
>>>
>>>     $this ex:parent ?value .             (property)
>>>     ?value ex:parent $this .             (inv property)
>>>     $this ex:parent/ex:parent ?value .   (path)
>>>     BIND ($this AS ?value) .             (node constraint)
>>>
>>> but these variations would also be allowed:
>>>
>>>     $this $path ?value2 .
>>>
>>> which would allow binding more than one variable. And
>>>
>>>     $this $path $hasValue .
>>>
>>> to do an existential match. (For node constraints this could become a
>>> FILTER sameTerm etc)
>>>
>>> Maybe even
>>>
>>>     ?value $path $this .
>>>
>>> to walk the inverse direction. A nice thing about this syntax is that we
>>> only need to switch from $predicate to $path in our SPARQL snippets
>>> (section 4).
>>>
>>> We can take this further and generalize the special status of $path to
>>> apply to other path structures such as sh:equals, sh:lessThan. This
>>> would allow all kinds of paths to be used together with no extra costs
>>> to the implementation. Such constraint parameters could be marked, e.g.
>>> with sh:shape sh:Path .
>>>
>>> This pattern language may cover a sufficiently large set of use cases,
>>> and we could use that as a starting point until someone comes up with
>>> scenarios that cannot be expressed this way. As with everything related
>>> to the extension mechanism, it is very hard to anticipate all possible
>>> use cases, yet here I am quite optimistic that a good middle ground is
>>> found that supports code generalization without sacrificing
>>> expressivity.
>>>
>>> Holger
>>>
>>>
>>> On 10/06/2016 16:32, Dimitris Kontokostas wrote:
>>>> if we go this way, should we define something like a macro that
>>>> replaces [boilerplate] to the actual boilerplate?
>>>> It would make the queries easier to read and harder to make copy/paste
>>>> mistakes
>>>> something like %%BOILETPLATE%% or ##BOILETPLATE##
>>>> the first one makes the query invalid as a SPARQL query before we
>>>> replace the boilerplate code while the latter can be seen as a comment
>>>> but has the risk of being removed by mistake
>>>>
>>>> Dimitris
>>>>
>>>> On Fri, Jun 10, 2016 at 4:44 AM, Peter F. Patel-Schneider
>>>> <pfpschneider@gmail.com> wrote:
>>>>
>>>>     The boilerplate is SPARQL code what with $this, $context, and
>>>>     $predicate pre-bound and ?value not in scope produces solutions
>>>>     for ?value as the value nodes. It can either bind ?subject and
>>>>     ?object as appropriate or these can be determined by the
>>>>     governing code, which is the solution used here. A suitable
>>>>     boilerplate that does not use anything beyond pre-binding is
>>>>     given here, but it would also be possible to do something more
>>>>     sophisticated.
>>>>
>>>>     [boilerplate]
>>>>
>>>>     { $this $predicate ?value .
>>>>       FILTER ( sameTerm($context,sh:PropertyConstraint) )
>>>>     } UNION {
>>>>       ?value $predicate $this .
>>>>       FILTER ( sameTerm($context,sh:InversePropertyConstraint) )
>>>>     } UNION {
>>>>       BIND ( $this AS ?value )
>>>>       FILTER ( sameTerm($context,sh:NodeConstraint) )
>>>>     }
>>>>
>>>>     The code for the core constraint components is then as follows:
>>>>
>>>>     sh:class
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT EXISTS { ?value rdf:type/rdfs:subClassOf* $class }
>>>>     }
>>>>
>>>>     sh:classIn
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT EXISTS {
>>>>         GRAPH $shapesGraph { $classIn (rdf:rest*)/rdf:first ?class . }
>>>>         ?value rdf:type/rdfs:subClassOf* ?class .
>>>>       }
>>>>     }
>>>>
>>>>     sh:datatype
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT ( isLiteral(?value) && datatype(?value) = $datatype )
>>>>     }
>>>>
>>>>     sh:datatypeIn
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT EXISTS {
>>>>         GRAPH $shapesGraph { $classIn (rdf:rest*)/rdf:first ?datatype . }
>>>>         FILTER ( isLiteral(?value) && datatype(?value) = ?datatype )
>>>>       }
>>>>     }
>>>>
>>>>     sh:maxExclusive
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER (?value < $maxExclusive)
>>>>     }
>>>>
>>>>     sh:maxInclusive is similar
>>>>     sh:minExclusive is similar
>>>>     sh:minInclusive is similar
>>>>
>>>>     sh:in
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT EXISTS {
>>>>         GRAPH $shapesGraph { $in (rdf:rest*)/rdf:first ?value . }
>>>>       }
>>>>     }
>>>>
>>>>     sh:minLength
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT ( !isBlank(?value) &&
>>>>                    STRLEN(STR(?value)) >= $minLength )
>>>>     }
>>>>
>>>>     sh:maxLength is similar
>>>>
>>>>     sh:nodeKind
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT
>>>>         ((isIRI(?value) && $nodeKind IN (sh:IRI, sh:BlankNodeOrIRI,
>>>>                                          sh:IRIOrLiteral) ) ||
>>>>          (isLiteral(?value) && $nodeKind IN ( sh:Literal,
>>>>                                               sh:BlankNodeOrLiteral,
>>>>                                               sh:IRIOrLiteral)) ||
>>>>          (isBlank(?value) && $nodeKind IN ( sh:BlankNode,
>>>>                                             sh:BlankNodeOrLiteral,
>>>>                                             sh:BlankNodeOrLiteral)))
>>>>     }
>>>>
>>>>     sh:pattern
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT (!isBlank(?value) &&
>>>>                   IF(bound($flags),
>>>>                      regex(str(?value), $pattern, $flags),
>>>>                      regex(str(?value), $pattern)))
>>>>     }
>>>>
>>>>     sh:stem
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT (isIRI(?value) && STRSTARTS(str(?value), $stem))
>>>>     }
>>>>
>>>>     sh:shape
>>>>
>>>>     SELECT $this ?value ?failure WHERE {
>>>>       [boilerplate]
>>>>       BIND (sh:hasShape(?value, $shape, $shapesGraph) AS ?hasShape) .
>>>>       BIND (!bound(?hasShape) AS ?failure) .
>>>>       FILTER (?failure || !?hasShape) .
>>>>     }
>>>>
>>>>     sh:hasValue
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       FILTER NOT EXISTS {
>>>>         [boilerplate]
>>>>         FILTER (sameTerm(?value,$hasValue) )
>>>>     }
>>>>
>>>>     sh:maxCount
>>>>
>>>>     SELECT SAMPLE($this) WHERE {
>>>>       [boilerplate]
>>>>     } HAVING ( COUNT ( DISTINCT ?value ) > $maxCount )
>>>>
>>>>     sh:minCount is similar
>>>>
>>>>     sh:equals
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       {
>>>>         [boilerplate]
>>>>         MINUS { $this $equals ?value . }
>>>>       } UNION {
>>>>         $this $equals ?value .
>>>>         MINUS { [boilerplate] }
>>>>       }
>>>>     }
>>>>
>>>>     sh:disjoint
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       $this $disjoint ?value .
>>>>     }
>>>>
>>>>     sh:lessThan
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       $this $lessThan ?value2 .
>>>>       FILTER (!(?value < ?value2))
>>>>     }
>>>>
>>>>     sh:lessThanOrEquals is similar
>>>>
>>>>     sh:uniqueLang
>>>>
>>>>     SELECT SAMPLE($this) ?lang WHERE {
>>>>       { FILTER ($uniqueLang) }
>>>>       [boilerplate]
>>>>       BIND (lang(?value) AS ?lang)
>>>>       FILTER (isLiteral(?value) && bound(?lang) && ?lang != "")
>>>>     } GROUP BY ?lang HAVING ( COUNT(?this) > 1 )
>>>>
>>>>     sh:qualifiedMaxCount
>>>>
>>>>     SELECT SAMPLE($this) ( SUM(?failed)>0 AS ?failure ) WHERE {
>>>>       [boilerplate]
>>>>       BIND (sh:hasShape(?value, $qualifiedValueShape, $shapesGraph)
>>>>             AS ?hasShape)
>>>>       BIND (!bound(?hasShape) AS ?failure)
>>>>       BIND ( IF(?failure,1,0) AS ?failed )
>>>>       FILTER ( IF(?failure, true, ?hasShape) )
>>>>     } HAVING ( ( COUNT ( DISTINCT ?value ) > $qualifiedMaxCount ) ||
>>>>                ( SUM(?failed) > 0 ) )
>>>>
>>>>     sh:qualifiedMinCount is similar
>>>>
>>>>     sh:closed
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       { FILTER ($closed) }
>>>>       [boilerplate]
>>>>       ?value ?predicate ?object .
>>>>       FILTER (NOT EXISTS {
>>>>                 GRAPH $shapesGraph { $currentShape
>>>>                     sh:property/sh:predicate ?predicate . }
>>>>               } &&
>>>>               ( !bound($ignoredProperties) ||
>>>>                 NOT EXISTS {
>>>>                   GRAPH $shapesGraph { $ignoredProperties
>>>>                       rdf:rest*/rdf:first ?predicate . }
>>>>                 }
>>>>               ) )
>>>>     }
>>>>
>>>>     sh:not
>>>>
>>>>     SELECT $this ?value ?failure WHERE {
>>>>       [boilerplate]
>>>>       BIND (sh:hasShape(?value, $not, $shapesGraph) AS ?hasShape) .
>>>>       BIND (!bound(?hasShape) AS ?failure) .
>>>>       FILTER (?failure || ?hasShape) .
>>>>     }
>>>>
>>>>     sh:and
>>>>
>>>>     SELECT $this ?value ?failure WHERE {
>>>>       [boilerplate]
>>>>       GRAPH $shapesGraph { $and (rdf:rest*)/rdf:first ?conjunct . }
>>>>       BIND (sh:hasShape(?value, ?conjunct, $shapesGraph) AS ?hasShape)
>>>>       BIND (!bound(?hasShape) AS ?failure)
>>>>       FILTER (?failure || !?hasShape)
>>>>     }
>>>>
>>>>     sh:or
>>>>
>>>>     SELECT $this ?value WHERE {
>>>>       [boilerplate]
>>>>       FILTER NOT EXISTS {
>>>>         GRAPH $shapesGraph { $or (rdf:rest*)/rdf:first ?disjunct . }
>>>>         BIND (sh:hasShape(?value, ?disjunct, $shapesGraph)
>>>>               AS ?hasShape)
>>>>         BIND (!bound(?hasShape) AS ?failure)
>>>>         FILTER ( !?failure || ?hasShape )
>>>>       }
>>>>     }
>>>>
>>>>
>>>>     peter
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> Dimitris Kontokostas
>>>> Department of Computer Science, University of Leipzig & DBpedia
>>>> Association
>>>> Projects: http://dbpedia.org, http://rdfunit.aksw.org,
>>>> http://aligned-project.eu
>>>> Homepage: http://aksw.org/DimitrisKontokostas
>>>> Research Group: AKSW/KILT http://aksw.org/Groups/KILT
>>>>
>>>
>>
>
>
>
--
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet/+1-510-984-3600
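
For readers following the thread, the $path substitution that Holger proposes in the quoted message can be sketched roughly as below. This is only an illustration under stated assumptions, not code from the working group or from any SHACL engine: the function name expand_path, the context labels, and the regular expression are all hypothetical, and the node-constraint handling of "$this $path $hasValue" follows Holger's remark that it "could become a FILTER sameTerm".

```python
import re

# Hypothetical context labels; the thread names these four cases but
# does not define identifiers for them.
PROPERTY, INVERSE, PATH, NODE = "property", "inverse", "path", "node"

# A triple pattern whose predicate is the $path placeholder,
# e.g. "$this $path ?value ."
_PLACEHOLDER = re.compile(r"(?P<s>[$?]\w+)\s+\$path\s+(?P<o>[$?]\w+)\s*\.")


def expand_path(query: str, context: str, path: str = "") -> str:
    """Replace every '$path' triple pattern with context-specific SPARQL text."""

    def repl(match: re.Match) -> str:
        s, o = match.group("s"), match.group("o")
        if context in (PROPERTY, PATH):
            # Forward direction: plain predicate or property path.
            return f"{s} {path} {o} ."
        if context == INVERSE:
            # Inverse property: swap subject and object.
            return f"{o} {path} {s} ."
        if context == NODE:
            # Empty path: the value node is the focus node itself.
            if o.startswith("?"):
                return f"BIND ({s} AS {o}) ."
            if s.startswith("?"):
                return f"BIND ({o} AS {s}) ."
            # Both sides pre-bound: existential check becomes a term test.
            return f"FILTER (sameTerm({s}, {o})) ."
        raise ValueError(f"unknown context: {context!r}")

    return _PLACEHOLDER.sub(repl, query)


template = "SELECT $this ?value WHERE { $this $path ?value . }"
print(expand_path(template, PROPERTY, "ex:parent"))
print(expand_path(template, INVERSE, "ex:parent"))
print(expand_path(template, PATH, "ex:parent/ex:parent"))
print(expand_path(template, NODE))
```

The four calls at the end correspond to the four replacement lines in Holger's message (property, inverse property, path, node constraint). Plain text substitution of this kind is exactly the "code injection" the thread debates, so a real engine would more likely rewrite a parsed query than a string.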
Received on Thursday, 16 June 2016 15:13:59 UTC