Re: shapes-ISSUE-63 (sh:hasShape): Nested shapes: sh:hasShape function versus recursive SPARQL code generation [SHACL Spec]

On 6/5/2015 4:55, Peter F. Patel-Schneider wrote:
> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA256
>
> One of my concerns with ISSUE-63 is that hasShape may not be implementable
> in many SPARQL systems.

It is premature to come to a decision on this topic, but I doubt that 
your point above will outweigh the benefits of the sh:hasShape function. 
Yes, there may be some systems where this function is difficult to 
implement, but SPARQL functions are an official extension point of the 
standard, so I would expect that all proper SPARQL APIs provide such a 
hook. The only cases where such a hook would rely on the goodwill of a 
3rd party would be in closed-source database products.

We need to distinguish two layers here:

1) When SHACL is implemented with a generic graph-based API such as Jena 
or Sesame, and query execution happens against their graph interfaces, 
then the evaluation of functions happens over the iterators produced by 
the simple SPO queries, and is therefore under complete control of the 
Jena/Sesame implementations. This means that in principle we can provide 
SHACL implementations for every database on the planet, as long as they 
have Jena or Sesame drivers, or a SPARQL endpoint that we can treat as 
an SPO graph. This very much aligns with the notion of datasets.

2) If, for performance reasons, people want to execute SHACL natively on 
a database (which has all named graphs etc. set up), then this database 
must already have SHACL implemented, including something like the 
validateNodeAgainstShape operation. If a vendor went through the effort 
of implementing this operation, then it is a trivial step to also expose 
this operation as a SPARQL function.
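
To illustrate layer 1, here is a minimal sketch in Python of how a
hasShape-style check can run purely on top of simple SPO lookups. The
Graph class, the SHAPES table and the ex: names are hypothetical
stand-ins of mine, not the Jena or Sesame API and not the SHACL
vocabulary; the sketch only covers "property must be present" and
"all values must have a nested shape", and it does not guard against
recursive shape definitions.

```python
from typing import Dict, Iterator, List, Optional, Set, Tuple

Triple = Tuple[str, str, str]

class Graph:
    """Hypothetical stand-in for a generic SPO graph interface."""

    def __init__(self, triples: Set[Triple]) -> None:
        self._triples = set(triples)

    def find(self, s: Optional[str] = None, p: Optional[str] = None,
             o: Optional[str] = None) -> Iterator[Triple]:
        # Simple SPO pattern match; None acts as a wildcard.
        for t in self._triples:
            if ((s is None or t[0] == s) and
                    (p is None or t[1] == p) and
                    (o is None or t[2] == o)):
                yield t

# Assumed, simplified shape table: shape -> [(property, nested shape or None)]
SHAPES: Dict[str, List[Tuple[str, Optional[str]]]] = {
    "ex:PersonShape": [("ex:address", "ex:AddressShape")],
    "ex:AddressShape": [("ex:city", None)],
}

def has_shape(graph: Graph, node: str, shape: str) -> bool:
    """Return True if node has every required property and all values
    of a property conform to its nested shape (if one is given)."""
    for prop, nested in SHAPES[shape]:
        values = [o for _, _, o in graph.find(node, prop, None)]
        if not values:
            return False
        if nested is not None and not all(
                has_shape(graph, v, nested) for v in values):
            return False
    return True
```

Because has_shape only calls graph.find, the same function would work
against any store that can answer SPO patterns, which is the point of
layer 1. Exposing it as a SPARQL function would then be a matter of
registering it with the engine's extension hook.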

This leaves as the only interesting case the scenario where someone uses 
a generic API such as Jena but wants the queries to be natively executed 
on the target database, and the database does not support sh:hasShape. 
In this case, the (Jena) engine can apply some flattening algorithms 
similar to what you suggested in your draft, and inline the 
sh:hasShape function calls into a single large query. This should 
clearly be possible for many scenarios, especially for the core 
vocabulary (and sh:valueShape). Engines may want to optimize this 
anyway. However, I 
believe such optimizations should be out of scope for the WG, because we 
could quickly double the size of our documents and the complexity of the 
definitions. I find the current definitions using sh:hasShape very 
elegant and compact.
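
For what it is worth, a rough sketch of the kind of flattening I have
in mind, again in Python. It is not the algorithm from your draft; the
SHAPES table, the ex: namespace and the function names are my own
assumptions. Instead of calling sh:hasShape for a nested shape, it
inlines the nested shape's patterns under a double FILTER NOT EXISTS
("there is no value that fails the nested shape"). It would loop
forever on recursive shapes, which is one reason why this only works
for many scenarios and not all of them.

```python
from itertools import count
from typing import Dict, Iterator, List, Optional, Tuple

# Assumed, simplified shape table: shape -> [(property, nested shape or None)]
SHAPES: Dict[str, List[Tuple[str, Optional[str]]]] = {
    "ex:PersonShape": [("ex:address", "ex:AddressShape")],
    "ex:AddressShape": [("ex:city", None)],
}

def flatten(var: str, shape: str,
            shapes: Dict[str, List[Tuple[str, Optional[str]]]],
            fresh: Iterator[int]) -> str:
    """Build graph patterns that match iff `var` conforms to `shape`,
    inlining nested shapes instead of emitting a function call."""
    parts: List[str] = []
    for prop, nested in shapes[shape]:
        # The property must have at least one value.
        parts.append(f"{var} {prop} ?v{next(fresh)} .")
        if nested is not None:
            # No value of the property may fail the nested shape.
            value = f"?v{next(fresh)}"
            inner = flatten(value, nested, shapes, fresh)
            parts.append(
                f"FILTER NOT EXISTS {{ {var} {prop} {value} . "
                f"FILTER NOT EXISTS {{ {inner} }} }}"
            )
    return " ".join(parts)

def conformance_query(node: str, shape: str,
                      shapes: Dict[str, List[Tuple[str, Optional[str]]]]) -> str:
    """Return one ASK query for `node` that contains no sh:hasShape call."""
    body = flatten(node, shape, shapes, count())
    return f"PREFIX ex: <http://example.org/> ASK {{ {body} }}"
```

Calling conformance_query("ex:alice", "ex:PersonShape", SHAPES) yields
a single ASK query that a database can execute natively without
knowing anything about sh:hasShape. Even this toy version shows how
quickly the generated text grows with nesting depth.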

Holger

Received on Thursday, 4 June 2015 23:00:25 UTC