Re: ISSUE-23: A specific proposal from Holger Knublauch on 2015-04-27 (public-data-shapes-wg@w3.org from April 2015)

From: Holger Knublauch <holger@topquadrant.com>
Date: Mon, 27 Apr 2015 11:23:22 +1000
To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <553D8F8A.1020709@topquadrant.com>

On 4/27/2015 9:34, Michel Dumontier wrote:
>
> On Sun, Apr 26, 2015 at 3:47 PM, Holger Knublauch 
> <holger@topquadrant.com> wrote:
>
>     Michel,
>
>     SHACL validation is launched over a dataset that has a default
>     graph - a collection of triples that have been selected by
>     whatever mechanism. This includes the possibility for an
>     application to build a (possibly temporary) default graph
>     containing additional triples about an IRI not owned by you. So
>     technically, this provides quite a bit of flexibility. I agree
>     this practice would not necessarily translate well to a pure
>     Linked Data approach where all information about a resource needs
>     to reside at its URI.
>
>     On 4/27/2015 6:59, Michel Dumontier wrote:
>>     It's fine to make a reference to a resource, but it's not OK to
>>     make assertions where the subject IRI is not owned by you. For
>>     instance, using the modeling I advocate,
>>
>>     bob's file:
>>     bob:Issue
>>       a owl:Class .
>>
>>     holger's file, where he modifies the meaning of bob:Issue directly:
>>     bob:Issue
>>     sh:property [
>>     sh:predicate ex:assignedTo ;
>>     rdfs:label "assigned to" ;
>>     rdfs:comment "The assignee of an issue must be a person." ;
>>     sh:maxCount 1 ;
>>     sh:valueType schema:Person ;
>>     ] .
>>
>>     michel's file, where he creates a new shape to validate that
>>     bob:Issues satisfy a particular constraint:
>>
>>     michel:IssueShape
>>       rdfs:subClassOf bob:Issue ;
>>       sh:property [
>>     sh:predicate ex:assignedTo ;
>>     rdfs:label "assigned to" ;
>>     rdfs:comment "The assignee of an issue must be a person." ;
>>     sh:maxCount 1 ;
>>     sh:valueType schema:Person ;
>>     ] .
>
>     Both approaches look fine to me. In the second approach you would
>     still need to instruct the engine that all instances of Issue need
>     to be verified against your extended IssueShape. This is covered
>     by the existing rdf:type triples in the first example. How would
>     you like to represent this in the second approach?
>
>
> I guess this goes back to the comment I made a while ago ... with some 
> kind of selector it would be more obvious how to choose the triples 
> for validation.

I remember an earlier discussion with you that resulted in me adding 
sh:scope to constraints:

     http://w3c.github.io/data-shapes/shacl/#scope

and I like such a feature because it adds a consistent 
filtering/pre-condition mechanism to constraints. It appears that you 
would also like to have a similar mechanism for Shapes, see below.

> For instance, something like:
>
> :IssueShape
>   a sh:Shape;
>   sh:selector [
>      sh:property rdf:type;
>      sh:valueType :Issue
>  ] .
>
>  I like this because it gives us the power to generate more complex 
> selectors. Otherwise, we might just specify to look for the 
> rdfs:subClassOf predicate on the shape (obviously, we'd need to type 
> our shape to sh:Shape here too).
>   What do you think?

Yes, this may work. We can tweak the metamodel in various ways to allow 
such things. Indeed we had previous discussions in the group, e.g. the 
idea of general Shape Selectors [1]. I also believe that Peter's 
proposal [2] has coverage for such general selection. I would be happy 
to add such features into the language if there is sufficient agreement 
on this direction within the group. Clearly this should be in addition 
to selection based on rdf:type and sh:nodeShape as outlined in my 
current draft [3] - we don't want to over-complicate the simple use 
cases. Neither do we want to overload the language with too many modes 
of operation.

There is one performance issue to consider though, if we allow shapes to 
specify under which conditions they apply. Example: if the user is 
editing a form of a specific resource, and wants to perform validation 
of her edits, then only that resource needs to be validated - not the 
whole graph. This would be similar to the validateNode operation [4]. In 
the case of general selectors, this means that the system would need to 
walk through all defined shapes and check their selectors. In large 
models this may be quite time-consuming. In the case of rdf:type and 
sh:nodeShape, the situation is simpler because the system only needs to 
look at a small number of shapes.
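
To make the cost difference concrete, here is a rough sketch in plain Python. It is not how any SHACL engine is implemented. The triples, the shape names, the class-to-shape index and the callable "selectors" are all hypothetical, and it ignores rdfs:subClassOf inference and sh:nodeShape (which could be handled by a similar direct lookup). It only shows that type-based selection touches the shapes registered for the node's own types, while general selectors force a check of every defined shape.

```python
from collections import defaultdict

RDF_TYPE = "rdf:type"

# Hypothetical toy graph: triples as (subject, predicate, object) tuples.
triples = {
    ("ex:issue1", RDF_TYPE, "ex:Issue"),
    ("ex:issue1", "ex:assignedTo", "ex:alice"),
    ("ex:alice", RDF_TYPE, "schema:Person"),
}


def shapes_by_type(node, triples, class_index):
    """Type-based selection: look up shapes only for the node's rdf:type values."""
    types = {o for (s, p, o) in triples if s == node and p == RDF_TYPE}
    return sorted(shape for t in types for shape in class_index.get(t, ()))


def shapes_by_selector(node, triples, selectors):
    """General selectors: every shape's selector is evaluated against the node."""
    checked = 0
    matched = []
    for shape, selector in selectors.items():
        checked += 1
        if selector(node, triples):
            matched.append(shape)
    return sorted(matched), checked


def type_selector(cls):
    """Hypothetical selector that matches nodes with the given rdf:type."""
    return lambda node, triples: (node, RDF_TYPE, cls) in triples


# Build a model with 1000 unrelated shapes plus one shape for ex:Issue.
class_index = defaultdict(list)
selectors = {}
for i in range(1000):
    cls, shape = f"ex:Class{i}", f"ex:Shape{i}"
    class_index[cls].append(shape)
    selectors[shape] = type_selector(cls)
class_index["ex:Issue"].append("ex:IssueShape")
selectors["ex:IssueShape"] = type_selector("ex:Issue")

by_type = shapes_by_type("ex:issue1", triples, class_index)
by_selector, shapes_checked = shapes_by_selector("ex:issue1", triples, selectors)
print(by_type, by_selector, shapes_checked)
```

Both functions find the same shape for ex:issue1, but the selector variant has to evaluate all 1001 selectors to get there, which is the overhead I am worried about for single-resource validation.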

Thanks,
Holger

[1] https://www.w3.org/2014/data-shapes/wiki/Shape_Selectors
[2] https://www.w3.org/2014/data-shapes/wiki/Shacl-sparql
[3] http://w3c.github.io/data-shapes/shacl/#shape-selection
[4] http://w3c.github.io/data-shapes/shacl/#operation-validateNode
Received on Monday, 27 April 2015 01:25:02 UTC