Re: shapes-ISSUE-30 (shape-and-data-graphs): Are shapes and data in the same graph? [SHACL Spec] from Holger Knublauch on 2015-04-09 (public-data-shapes-wg@w3.org from April 2015)

From: Holger Knublauch <holger@topquadrant.com>
Date: Thu, 09 Apr 2015 10:07:49 +1000
To: public-data-shapes-wg <public-data-shapes-wg@w3.org>
Message-ID: <5525C2D5.9080408@topquadrant.com>
On 3/31/2015 20:22, Dimitris Kontokostas wrote:
> I am skeptical about this feature. It cannot work if, for instance, you 
> want to validate a SPARQL endpoint against a set of shapes not stored 
> directly in the endpoint.

Do we even support SPARQL endpoints at this stage? I believe all we ever 
talked about were RDF datasets. As discussed in [1], an RDF dataset is a 
collection of RDF graphs. SPARQL endpoints are invoked quite differently, 
via a network protocol, and as you point out, SPARQL endpoints are 
one-way streets that cannot necessarily query the other graphs in the 
dataset that the SHACL engine is started with.

I do not want to discourage this use case, and I know that APIs such as 
Jena do have a concept of remote query execution. But for now the 
assumption was that we are dealing with graphs (similar to the Jena 
Graph API) that can answer basic SPO queries and may have SPARQL support.

I believe it requires an explicit decision by the WG to support SPARQL 
endpoints as validation targets, and the usual process would be user 
story -> requirement -> approval/rejection. Until then, we should assume 
simplicity and only talk about datasets. Note that you could define a 
virtual named graph that wraps a SPARQL endpoint, yet this graph would 
need to fulfill the usual contracts.
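
As a rough illustration of what "the usual contracts" could mean, here is a minimal Python sketch. It assumes the contract is nothing more than answering basic subject-predicate-object patterns. The names Graph, InMemoryGraph, EndpointBackedGraph, find and to_sparql are hypothetical and are not taken from Jena or from any SHACL draft. The remote call is passed in as a plain function so that the sketch runs without a network.

```python
from typing import Callable, Iterable, Iterator, Optional, Protocol, Tuple

Triple = Tuple[str, str, str]
Pattern = Tuple[Optional[str], Optional[str], Optional[str]]


class Graph(Protocol):
    """Assumed minimal contract: answer basic S-P-O pattern queries."""

    def find(self, s: Optional[str] = None, p: Optional[str] = None,
             o: Optional[str] = None) -> Iterator[Triple]:
        ...


class InMemoryGraph:
    """Ordinary graph backed by a local set of triples."""

    def __init__(self, triples: Iterable[Triple]) -> None:
        self._triples = set(triples)

    def find(self, s=None, p=None, o=None):
        # None acts as a wildcard for that position.
        for t in self._triples:
            if ((s is None or t[0] == s) and (p is None or t[1] == p)
                    and (o is None or t[2] == o)):
                yield t


class EndpointBackedGraph:
    """Virtual graph: forwards every pattern to a remote service.

    run_pattern is a stand-in for whatever executes the remote query.
    """

    def __init__(self, run_pattern: Callable[[Pattern], Iterable[Triple]]) -> None:
        self._run_pattern = run_pattern

    def find(self, s=None, p=None, o=None):
        yield from self._run_pattern((s, p, o))


def to_sparql(pattern: Pattern) -> str:
    """Render a pattern as a SPARQL CONSTRUCT query (full IRIs only, no literals)."""
    names = ("?s", "?p", "?o")
    terms = [f"<{v}>" if v is not None else n for v, n in zip(pattern, names)]
    body = " ".join(terms)
    return f"CONSTRUCT {{ {body} }} WHERE {{ {body} }}"


# Usage: a local graph plays the role of the remote endpoint.
remote = InMemoryGraph([("ex:X", "ex:p1", "ex:a"), ("ex:a", "ex:p2", "ex:c")])
virtual = EndpointBackedGraph(lambda pattern: remote.find(*pattern))
matches = list(virtual.find(p="ex:p1"))
query = to_sparql((None, "http://example.org/p1", None))
```

The point of the wrapper is that a validation engine written against find() would not need to know whether a graph is local or remote. Whether such a wrapper can also see the other graphs of the dataset is exactly the open question raised above.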

Holger

[1] http://www.w3.org/TR/rdf11-datasets/

> If we allow this the difference should be somehow obvious, e.g. as a 
> separate (sub)property.
>
> We also need to make clear that we check only the existence of the 
> direct literal values or IRIs and we do not perform subgraph comparison
>
> e.g.
> # SHACL Graph - simplified
> sh:property [
>  sh:predicate ex:p1
>  sh:allowedValues ex:a
> ]
> ex:a ex:p2 ex:b
>
> # Data Graph - simplified
>
> ex:X ex:p1 ex:a
> ex:a ex:p2 ex:c
>
> This should pass since we do not check for the existence of "ex:a 
> ex:p2 ex:b" in the data
>
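The quoted example can be sketched in a few lines of Python, assuming that triples are plain tuples of prefixed names and that the allowed values are given as a simple set. The function name disallowed_values is hypothetical. The check looks only at the direct values of ex:p1 in the data graph, and the shapes-graph triple "ex:a ex:p2 ex:b" is never consulted.

```python
from typing import Iterable, List, Set, Tuple

Triple = Tuple[str, str, str]


def disallowed_values(data: Iterable[Triple], predicate: str,
                      allowed: Set[str]) -> List[Tuple[str, str]]:
    """Return (subject, value) pairs whose direct value is not in the allowed set.

    Only the value term itself is compared; statements about the value
    (its "subgraph") are deliberately ignored.
    """
    return sorted((s, o) for (s, p, o) in data
                  if p == predicate and o not in allowed)


# Shapes graph (simplified): ex:p1 may only take the value ex:a.
allowed_for_p1 = {"ex:a"}
# This shapes-graph triple plays no role in the check.
shapes_extra = {("ex:a", "ex:p2", "ex:b")}

# Data graph (simplified), as in the example above.
data_graph = {("ex:X", "ex:p1", "ex:a"), ("ex:a", "ex:p2", "ex:c")}

violations = disallowed_values(data_graph, "ex:p1", allowed_for_p1)
passes = not violations
```

Under these assumptions the data graph passes, even though it states "ex:a ex:p2 ex:c" where the shapes graph states "ex:a ex:p2 ex:b".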
>
> On Tue, Mar 31, 2015 at 2:06 AM, Holger Knublauch 
> <holger@topquadrant.com <mailto:holger@topquadrant.com>> wrote:
>
>     FWIW my prototype uses two properties
>
>     - sh:include (points from a graph to another data graph to include
>     similar to owl:imports)
>     - sh:library (points from a graph to a graph with constraint
>     definitions)
>
>     These properties could be used to dynamically determine which
>     other subgraphs need to be added into the query graph (visible in
>     the WHERE clauses by default), versus the "compile-time" graph
>     that is needed to figure out which constraints to validate at all.
>
>     This assumes that a typical set up is roughly like
>
>     my:InstanceGraph
>         sh:include/owl:imports my:ClassGraph (may include constraints)
>             sh:library SHACL namespace + other template libraries etc
>
>     In that case, the query graph would consist of the first two
>     graphs only, but exclude those graphs referenced via sh:library only.
>
>     Another set up would be
>
>     my:InstanceGraph
>         sh:include/owl:imports my:ClassGraph (just the schema)
>         sh:library my:ConstraintsGraph (constraints for schema)
>             sh:include my:ClassGraph
>
>     in which case the query graph would consist of the first two
>     graphs only, while my:ConstraintsGraph would only be visible at
>     "compile time".
>
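The distinction between the query graph and the "compile-time" graph can be sketched as two reachability computations over the graph links. This is only one reading of the description above. It assumes that links are followed transitively, that the query graph follows sh:include and owl:imports only, and that the compile-time view also follows sh:library. The function reachable and the links dictionary are hypothetical, and the prototype's actual behavior is not shown in this thread.

```python
from typing import Dict, List, Set, Tuple

INCLUDE = {"sh:include", "owl:imports"}
LIBRARY = {"sh:library"}

# graph name -> list of (linking property, target graph)
Links = Dict[str, List[Tuple[str, str]]]


def reachable(start: str, links: Links, follow: Set[str]) -> Set[str]:
    """Collect all graphs reachable from start via the given link properties."""
    seen = {start}
    todo = [start]
    while todo:
        current = todo.pop()
        for prop, target in links.get(current, []):
            if prop in follow and target not in seen:
                seen.add(target)
                todo.append(target)
    return seen


# Second set-up from the message above.
links: Links = {
    "my:InstanceGraph": [
        ("owl:imports", "my:ClassGraph"),
        ("sh:library", "my:ConstraintsGraph"),
    ],
    "my:ConstraintsGraph": [("sh:include", "my:ClassGraph")],
}

query_graphs = reachable("my:InstanceGraph", links, INCLUDE)
compile_graphs = reachable("my:InstanceGraph", links, INCLUDE | LIBRARY)
compile_only = compile_graphs - query_graphs
```

With these inputs, query_graphs holds my:InstanceGraph and my:ClassGraph, while my:ConstraintsGraph appears only in compile_only, which matches the outcome described for this set-up.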
>     I had hinted at such a solution when I wrote down the requirements
>
>     https://www.w3.org/2014/data-shapes/wiki/Requirements#Organizing_Constraints_in_Named_Graphs
>     https://www.w3.org/2014/data-shapes/wiki/Requirements#Including_Named_Graphs_for_Query_Evaluation
>
>     which did not get any support yet. Maybe this problem was deemed
>     too far down, and I am still not convinced that now is the best
>     time to discuss these details - I believe we have more critical
>     questions to answer first.
>
>     Holger
>
>
>
>     On 3/31/2015 3:11, Karen Coyle wrote:
>
>
>
>         On 3/29/15 3:49 PM, Richard Cyganiak wrote:
>         This though assumes that you have control over the instance
>         data, which is not always the case. So although this may work
>         for some applications, others will be operating over data
>         created by third parties who have their own data model. I
>         mention this just so we can keep in mind that we have both
>         situations to address.
>
>
>             I don't follow. Why does the described design require that
>             I have control of the instance data, and why wouldn't it
>             work with third-party data?
>
>             Richard
>
>
>         Richard, I may have misunderstood your example, but the
>         situation I am referring to is one in which you are unlikely
>         to know what graphs are used in someone else's instance data,
>         but you still need to validate properties and values.
>
>         kc
>
>
>
>
>
>
> -- 
> Dimitris Kontokostas
> Department of Computer Science, University of Leipzig
> Research Group: http://aksw.org
> Homepage: http://aksw.org/DimitrisKontokostas
Received on Thursday, 9 April 2015 00:09:14 UTC