Re: resolving ISSUE-47: Can SPARQL-based constraints access the shape graph, and how? from Dimitris Kontokostas on 2015-06-12 (public-data-shapes-wg@w3.org from June 2015)

From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
Date: Fri, 12 Jun 2015 08:52:04 +0300
To: Holger Knublauch <holger@topquadrant.com>
Cc: public-data-shapes-wg <public-data-shapes-wg@w3.org>
Message-ID: <CA+u4+a1ib7gJDYzNh_LoOMJg-y7zfuSenF9AOxANR1519_qziQ@mail.gmail.com>
On Fri, Jun 12, 2015 at 2:57 AM, Holger Knublauch <holger@topquadrant.com>
wrote:

>  On 6/12/2015 8:14, Dimitris Kontokostas wrote:
>
>
>
> On Thu, Jun 11, 2015 at 11:18 PM, Holger Knublauch <holger@topquadrant.com>
> wrote:
>
>> On 6/12/15 5:51 AM, Dimitris Kontokostas wrote:
>>
>>>
>>> Summing up from the meeting, the whole core language can be implemented
>>> without access to the constraints graph.
>>>
>>>
>>  Could you clarify or give examples? I believe this requires a
>> substantial change from a template-based generic approach to a solution in
>> which these core templates require a different, hard-coded mechanism to
>> produce SPARQL queries. I would find the latter very ugly and it adds to
>> the implementation burden.
>
>
>  Implementers who do not want to support SPARQL Endpoints can do
> optimizations like the ones in the current spec.
>
>
> Yes, possibly. Still the overall spec becomes more complex because we need
> to introduce hard-coded patterns.
>
>
>
>> One of the very strengths of RDF and OWL is that people can use the same
>> mechanism to query data and ontology, and here we are simply allowing that
>> same principle.
>
>
>  I agree, and this is the reason we define SHACL in RDF. However, I don't
> think ontologies are required to exist, or to be queried, together with the
> data.
>
>
>>
>>> Some parts of the spec would indeed be simpler with access to the
>>> constraints graph but I don't see this alone as a reason to require access.
>>>
>>> I suggest we gather all the use cases where access is needed beyond the
>>> core language and evaluate them.
>>>
>>>
>>  One case is recursion (e.g. sh:valueShape). How could this be
>> implemented without a function such as sh:hasShape, which takes a shape as
>> a parameter?
>>
>
>  There is no resolution for recursion yet.
>
>
> Yes, and here things become intertwined. We could try to vote on the
> recursion issue beforehand. If we vote in favor of recursion with
> sh:hasShape, then the implication is that we must also vote in favor of
> ?shapesGraph access. This is my preferred outcome. However, due to these
> implications, some people may vote against recursion (like Peter did in the
> past) to make sure that a code-generation approach always remains a viable
> option.
>
> To break this deadlock, I can only try my best to work with you on your
> performance concerns in the dbpedia use case. One obvious solution for you
> would be to wait for Virtuoso's native SHACL support - assuming OpenLink is
> committed to this, which I cannot comment on. AFAIK dbpedia is already
> running on their platform, so you would just need to create a named graph
> for your shapes on the same Virtuoso instance, and let the engine do all
> the rest natively. This would be by far the fastest solution anyway.
> Meanwhile, you may need to live with slower performance for certain cases.
>
> Having said this, we should look at which cases we are really talking
> about now. We agree that any SHACL implementation could optimize the core
> templates such as sh:ClosedShape, so that they no longer require access to
> the shapes graph. This is relatively straightforward to implement in Java.
> So the only case where you wouldn't get the native execution speed would be
> for custom templates or constraints. Assuming you have these constraints
> under your control, you can easily make sure that they don't use
> ?shapesGraph. So the only problem case is if
> - you need to talk to a SPARQL end point
> - the SPARQL end point doesn't have the shapes graph as a named graph
> - you have queries that need to access ?shapesGraph
>
> To me this feels rather like a corner case, that shouldn't force our hands
> with the design of the whole spec. If SHACL is as successful as we all
> hope, then most vendors will sooner or later be forced to add native SHACL
> support, through market pressure. If it isn't successful, then, well, you
> have the fallback of using the same hand-written SPARQL queries that you
> currently have, and bypass the SHACL machinery.
>
> Furthermore, did we even formally decide that SPARQL end points are
> supported? I believe we only ever talked about datasets, and in that case
> the SPARQL end point would have to be wrapped into a Graph (with SPO
> queries).
>
>   Besides this, are there any other use cases beyond the core language?
>
>
> As discussed: any use case with sh:hasShape. But of course it's a
> nice-to-have feature. People will not even think about what they could do,
> if we don't allow this feature. I would be curious to observe what our user
> base will produce with this feature - it is very powerful!
>

If there is no other strong evidence that access is required, I suggest a
resolution with more or less the following wording:
"SPARQL-based constraints cannot access the shapes graph", which is in line
with the issue description.
The issue does not refer to the core language, and we can create new issues
for core after, e.g., we have a resolution for recursion. This also leaves
open how the spec is written.
If we agree on this, implementers can optimize certain parts of core to
perform better, since arbitrary access is not allowed. I also assume we will
all try to make SHACL core as SPARQL-endpoint-friendly as possible.
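
To make the optimization concrete, here is a minimal sketch in Python of the
code-generation style discussed in this thread: parameter values are read
from the shapes graph locally and substituted into the query text, so the
query sent to the endpoint never mentions ?shapesGraph. Everything in it is
an assumption made for illustration: the template text, the function names
and the example IRIs are hypothetical and are not taken from the SHACL draft.

```python
import re
from string import Template

# Hypothetical query text for a minimum-count check. It is NOT the template
# from the SHACL draft, only an illustration of parameter substitution.
MIN_COUNT_TEMPLATE = """
SELECT ?this (COUNT(?value) AS ?count)
WHERE {
  ?this a <$cls> .
  OPTIONAL { ?this <$predicate> ?value }
}
GROUP BY ?this
HAVING (COUNT(?value) < $minCount)
"""

# Characters that must not appear inside an IRI written as <...> in SPARQL.
_IRI_FORBIDDEN = set('<>"{}|^`\\ \n\t')


def instantiate_min_count(cls: str, predicate: str, min_count: int) -> str:
    """Return a query with the shape parameters already substituted."""
    for iri in (cls, predicate):
        if not iri or any(ch in _IRI_FORBIDDEN for ch in iri):
            raise ValueError(f"not a usable IRI: {iri!r}")
    if not isinstance(min_count, int) or min_count < 0:
        raise ValueError("min_count must be a non-negative integer")
    return Template(MIN_COUNT_TEMPLATE).substitute(
        cls=cls, predicate=predicate, minCount=min_count
    )


def touches_shapes_graph(query: str) -> bool:
    """True if the query text refers to the ?shapesGraph variable."""
    return re.search(r"[?$]shapesGraph\b", query) is not None


if __name__ == "__main__":
    query = instantiate_min_count(
        "http://example.org/Person", "http://example.org/name", 1
    )
    print(query)
    print("needs shapes graph:", touches_shapes_graph(query))
```

A check like touches_shapes_graph could also be run over custom constraints
before they are sent to a remote endpoint, to detect the problem case
described above.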

Best,
Dimitris


>
>
> Holger
>
>
>
>
>>
>> As Peter stated, if we don't allow access to the shapes graph, then a lot
>> of things in the current design would need to change. I think we should
>> take a hard look at the counter arguments before making such a change. I
>> accept there may be performance implications for scenarios such as remote
>> SPARQL execution against large databases, but in those cases there are
>> work-arounds such as generating optimized SPARQL queries. Many engines may
>> decide to implement optimizations for the core language anyway. But since
>> these are performance optimizations only, I don't think they should limit
>> what the general spec allows.
>
>
>  Querying big SPARQL endpoints is already a slow process and with this
> approach SHACL makes it even slower.
> We have a few SPARQL vendors in the WG, I am wondering about their opinion
> on this.
>
>  Dimitris
>
>
>>
>>
>> Holger
>>
>>
>>
>
>
>  --
>    Dimitris Kontokostas
> Department of Computer Science, University of Leipzig & DBpedia
> Association
> Projects: http://dbpedia.org, http://aligned-project.eu,
> http://rdfunit.aksw.org
> Homepage:http://aksw.org/DimitrisKontokostas
>  Research Group: http://aksw.org
>
>
>


-- 
Dimitris Kontokostas
Department of Computer Science, University of Leipzig & DBpedia Association
Projects: http://dbpedia.org, http://aligned-project.eu,
http://rdfunit.aksw.org
Homepage:http://aksw.org/DimitrisKontokostas
Research Group: http://aksw.org
Received on Friday, 12 June 2015 05:53:00 UTC