- From: Holger Knublauch <holger@topquadrant.com>
- Date: Fri, 03 Jul 2015 10:05:54 +1000
- To: public-data-shapes-wg <public-data-shapes-wg@w3.org>
The message from today's call was that we need to try to make progress on the ?shapesGraph question via email. So here we go again.

I believe we need to rephrase the question that we are trying to resolve. Instead of being about ?shapesGraph access only, this is more about how SHACL can interoperate best with SPARQL endpoints for external databases that have no direct Graph interface. Since every SPARQL endpoint can be turned into a virtual graph with SPO queries, the question can be further narrowed down to how to *optimize* performance against SPARQL endpoints, i.e. get along with as few queries as possible.

My proposal for this is ISSUE-71, i.e. a network protocol that just requires a single transaction. Assuming we agree this is useful, the remaining question becomes how to make sure that SPARQL endpoints that do not yet support the SHACL protocol have decent performance. And of those use cases, we only talk about scenarios where (for some reason) the person issuing the constraints is not able to control which constraints get written - nobody forces you to use ?shapesGraph or user-defined functions in your queries, for example.

All this is necessarily only for a subset of SHACL. For example, SPARQL endpoints have no notion of other execution languages like JavaScript. There is no mechanism to declare new functions on SPARQL endpoints. There is no way to ask a SPARQL endpoint to validate a given blank node that was previously returned (because the IDs are different each time). Recursion is highly questionable - I remember Arthur stating that recursion doesn't require shapes graph access, but then I assume you are back to making multiple calls to the SPARQL endpoint, not a single query (please send details). There is no way to control which named graphs are accessible from the endpoint, or even whether the endpoint supports all of the required SPARQL features and entailments.
And we would need to either disallow ?shapesGraph access in general or reject constraints that use it.

I cannot accept that this constrained environment shall dictate how the whole spec is written. Are these particular scenarios involving SPARQL endpoints really *that* important, especially given that there are plenty of alternative ways of talking to databases? Also note that many people have moved away from SPARQL endpoints due to their infamous unreliability.

Anyway, since we are already talking about a small slice of possible architectures, I think the best way forward is to

- Make that subset explicit, e.g. call it SHACL-SE
- Put warning signs around features that are outside of SHACL-SE
- Thus minimize the risk that SPARQL endpoints have reduced performance because the SHACL engine may need to fall back to its own local SPARQL engine.

This is similar to Arnaud's option c), to make certain features optional, and we will be able to deliver an FPWD soon, without spending months on the drawing board with something like Peter's SPARQL-only proposal that has never been implemented or tested, has serious performance problems, covers only a subset of approved requirements and yet will lead to a vastly more complex spec because we need to look into all kinds of SPARQL string generation problems.

The situation has parallels with OWL DL versus Full. In OWL DL there was a group of vocal people arguing that we need to save the world from certain worst-case scenarios that break their algorithms. An OWL Full scare campaign followed. To this date, some tools refuse to process OWL Full. In practice, however, many people use OWL Full features all the time without negative consequences. Thankfully, the OWL group at least permitted OWL Full to be explored on the market. We should allow the same evolution to happen here.
If the SPARQL endpoint folks are serious about performance they should support the consistent architecture proposed in ISSUE-71 instead of taking away the flexibility for everyone else.

Could those who voted for option b) (no ?shapesGraph support) please explain why they cannot live with my proposal above? Nobody forces you to use ?shapesGraph in your SPARQL constraints (if you are using SPARQL at all), and it is easy to detect this variable in 3rd party queries, so what is the big deal here?

Thanks,
Holger

http://www.w3.org/2014/data-shapes/track/issues/71
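
As a rough illustration of the claim in the closing paragraph that the ?shapesGraph variable is easy to detect in 3rd party queries, here is a minimal Python sketch. It is not taken from ISSUE-71 or any SHACL draft: the function name `uses_variable` and the regular expression are assumptions made for this example. It scans the query text for `?shapesGraph` or `$shapesGraph` while skipping string literals, IRIs and comments. A real engine would more likely inspect the parsed query; this scan can misfire on long (triple-quoted) strings and on `<`/`>` comparisons written without spaces.

```python
import re

# Hypothetical helper, for illustration only.
# Spans in which "?name" is not a variable: short string literals,
# IRIs such as <http://example.org/ns#x>, and "#" comments.
_SKIP = (
    r'"(?:[^"\\\n]|\\.)*"'      # "double-quoted string"
    r"|'(?:[^'\\\n]|\\.)*'"     # 'single-quoted string'
    r'|<[^<>"{}|^`\\\s]*>'      # <IRI>, which may itself contain '#'
    r'|#[^\n]*'                 # comment up to end of line
)
_TOKEN = re.compile(r'(?P<skip>' + _SKIP + r')|[?$](?P<var>\w+)')


def uses_variable(query: str, name: str = "shapesGraph") -> bool:
    """Return True if the SPARQL text mentions the variable ?name or $name."""
    for match in _TOKEN.finditer(query):
        # For skipped spans the "var" group is None, so they never compare equal.
        if match.group("var") == name:
            return True
    return False


if __name__ == "__main__":
    q = "SELECT ?this WHERE { GRAPH ?shapesGraph { ?this ?p ?o } }"
    print(uses_variable(q))
```

An engine could use a check of this kind to decide whether a constraint stays within the SPARQL endpoint subset or needs to be handled by a local SPARQL engine.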
Received on Friday, 3 July 2015 00:06:29 UTC