Re: resolving ISSUE-47: Can SPARQL-based constraints access the shape graph, and how? from Holger Knublauch on 2015-06-14 (public-data-shapes-wg@w3.org from June 2015)

From: Holger Knublauch <holger@topquadrant.com>
Date: Mon, 15 Jun 2015 09:11:56 +1000
To: public-data-shapes-wg <public-data-shapes-wg@w3.org>
Message-ID: <557E0A3C.4080403@topquadrant.com>

On 6/14/15 10:56 PM, Dimitris Kontokostas wrote:
> Maybe I am wrong but I think you are aiming for corner cases. I asked 
> if there are use cases for access beyond core and you didn't say 
> anything besides a "nice-to-have feature" and that you "would be 
> curious to observe what our user base will produce with this".

Why should the core language have a special status here? The constructs 
of the core language are just one selection of common use cases - other 
people will add their own. Currently the following core language 
constructs are defined using ?shapesGraph: sh:allowedValues, 
sh:valueShape, sh:NotConstraint, sh:AndConstraint, sh:OrConstraint, 
sh:XorConstraint, sh:ClosedShape, plus the (not yet approved) qualified 
cardinality constraints. Any recursive definition would require this. 
These are not corner cases at all but just random examples of real-world 
scenarios.

Here is another case, for named graph access in general. Assume you want 
to validate that certain terms from a given query graph are also present 
as SKOS concepts in some other graph. That SKOS named graph is not 
accessible to the server running DBpedia. With the general SPARQL 
endpoint scenario this is not implementable unless the endpoint can call 
out to external named graphs. So endpoints are very limited already, but 
this limitation shouldn't propagate into every use case.
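
To make the SKOS scenario concrete, here is a minimal sketch in plain Python rather than SPARQL. It assumes a dataset is modelled as a dict from graph name to a set of (subject, predicate, object) triples; the function name, the graph names and the prefixed terms are all hypothetical and not taken from any SHACL draft. It only illustrates that the check needs both named graphs to be reachable from the same dataset.

```python
# Sketch only: a dataset is {graph name: set of (s, p, o) triples}.
# All names and terms here are hypothetical illustrations.
RDF_TYPE = "rdf:type"
SKOS_CONCEPT = "skos:Concept"


def missing_skos_concepts(dataset, query_graph, skos_graph, prop):
    """Return the values of `prop` in query_graph that are not typed
    as skos:Concept in skos_graph (sorted for stable output)."""
    # Both graphs must be visible in the same dataset; an endpoint that
    # cannot reach the SKOS graph cannot run this check at all.
    for name in (query_graph, skos_graph):
        if name not in dataset:
            raise LookupError(f"named graph not accessible: {name}")

    concepts = {s for (s, p, o) in dataset[skos_graph]
                if p == RDF_TYPE and o == SKOS_CONCEPT}
    used = {o for (s, p, o) in dataset[query_graph] if p == prop}
    return sorted(used - concepts)
```

If the SKOS graph is missing from the dataset, the function raises instead of silently reporting every term as invalid, which mirrors the "not implementable" situation described above.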

> On the other hand, I think everyone agrees that allowing arbitrary 
> access is problematic in immutable RDF datasets.

There are work-arounds for these cases, by wrapping the immutable 
datasets with a virtual dataset. OTOH you didn't mention work-arounds 
for the recursion issue yet, nor SHACL functions nor blank node 
treatment. Instead you are suggesting a lowest-common-denominator 
approach in which everyone can only use the features that the weakest 
link can also support. This is IMHO a design mistake.
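
As an illustration of the virtual-dataset work-around, here is a minimal Python sketch. It again assumes a dataset modelled as a mapping from graph name to a set of triples; with_shapes_graph and the graph names are hypothetical, and a real triple store would offer its own wrapping mechanism. The point is only that the shapes graph can be exposed as one more named graph without writing to the immutable data.

```python
from collections import ChainMap
from types import MappingProxyType


def with_shapes_graph(immutable_dataset, shapes_graph_name, shapes_triples):
    """Return a read-only view in which the shapes graph appears as an
    extra named graph; the wrapped dataset is never modified."""
    overlay = {shapes_graph_name: frozenset(shapes_triples)}
    # ChainMap looks in the overlay first, then in the wrapped dataset.
    # MappingProxyType makes the combined view itself read-only.
    return MappingProxyType(ChainMap(overlay, immutable_dataset))
```

A query engine handed the returned view sees both the data graphs and the shapes graph, while the original dataset stays untouched.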

We already separate out engines (such as ShEx ones) that don't want to 
support SPARQL. In the worst case, we may also have modes that don't 
support ?shapesGraph, while allowing others to use that feature.

> We had a similar case for global constraints where we could find a 
> couple of examples but decided to drop them for the same reason.
>
> In case it wasn't clear, my suggested resolution refers _only_ to the 
> SPARQL extension mechanism _beyond the core language_. What I suggest 
> does not refer to SHACL core or the spec.
> Of course, I would be happy to re-consider if there is evidence that 
> this feature is indeed needed.

So are you saying that ?shapesGraph can still be used inside of the core 
language, i.e. the engine would have to support it anyway? This would be 
better, but then why not also allow it in general? Queries that use 
?shapesGraph are easy to spot and engines can fall back to the default 
(e.g. by using Jena's own SPARQL engine). Nobody is forced to use 
?shapesGraph.
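
A minimal sketch of the "easy to spot" check, in Python. The function names and the "native"/"fallback" labels are hypothetical; the only assumption about SPARQL is that a variable may be written with either a ? or a $ prefix. This is a purely textual test, so it would also fire on ?shapesGraph inside a string literal or comment, which errs on the side of falling back.

```python
import re

# SPARQL variables may be written as ?name or $name.
# \b ensures that e.g. ?shapesGraphX does not count as a match.
_SHAPES_GRAPH_VAR = re.compile(r"[?$]shapesGraph\b")


def uses_shapes_graph(query: str) -> bool:
    """True if the query text mentions the shapesGraph variable."""
    return _SHAPES_GRAPH_VAR.search(query) is not None


def choose_engine(query: str, native="native", fallback="fallback"):
    """Pick the fallback engine only for queries that need ?shapesGraph."""
    return fallback if uses_shapes_graph(query) else native
```

Queries that never mention the variable keep running on the engine's native path, so nothing changes for users who don't use the feature.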

>
> Also note that although SPARQL Endpoints are one case where access is 
> problematic, there can be many others when e.g. we apply SHACL in a 
> distributed processing system or a map-reduce step where we'll have 
> the same limitations.

Even the distributed scenarios would require named graph support, e.g. 
when someone has a GRAPH ?x statement in their SPARQL. So the main 
differentiator seems to be immutability, i.e. the case in which it is 
impossible to have the shapes graph as one named graph in the same dataset.

What do you think about my proposal to specify the SPARQL generation 
rules in a separate document or chapter? Wouldn't this be a compromise 
that addresses all use cases?

Thanks
Holger
Received on Sunday, 14 June 2015 23:12:32 UTC