RE: SHACL target extension from Varytimou, Natasa (Refinitiv) on 2020-05-20 (public-shacl@w3.org from May 2020)

From: Varytimou, Natasa (Refinitiv) <Natasa.Varytimou@refinitiv.com>
Date: Wed, 20 May 2020 10:48:22 +0000
To: Håvard Ottestad <hmottestad@gmail.com>, Andy Seaborne <andy@apache.org>
CC: "public-shacl@w3.org" <public-shacl@w3.org>
Message-ID: <DM6PR06MB47328D8AA216381C08E01B0791B60@DM6PR06MB4732.namprd06.prod.outlook.com>
Hi all

We also had a big performance issue with SHACL SPARQL targets, which are incredibly useful.
Is there anything that can be done to improve performance?
And the same question for filters (which I agree would be useful to include): will we have performance issues there as well?


-----Original Message-----
From: Håvard Ottestad <hmottestad@gmail.com> 
Sent: 20 May 2020 11:25
To: Andy Seaborne <andy@apache.org>
Cc: public-shacl@w3.org
Subject: Re: SHACL target extension

Hi Andy and Dimitris

Filters look like particularly useful constructs. They also look very powerful, which is both good and bad.

It’s quite close to what I want. I would want to have the filter run on all nodes in the data graph, essentially a sh:targetAllSubjects target. I think I saw something along those lines already, but I couldn’t find it now while writing this email.

I can see that a natural extension would be to allow filters to be used as targets themselves, maybe through the SHACL Advanced sh:target property.

Håvard

> On 20 May 2020, at 10:29, Andy Seaborne <andy@apache.org> wrote:
>
> Nice!
>
> That would be a useful addition to SHACL both on targets and on property shapes. And for rules.
>
> Were there any other features that got dropped that the community might be interested in?
>
>    Andy
>
>
> On 19/05/2020 22:29, Dimitris Kontokostas wrote:
>> Hi Håvard,
>> I think what you are after is something like the filter shape feature
>> (https://www.w3.org/TR/2016/WD-shacl-20160814/#filterShape).
>> This is something that existed in the first versions of
>> SHACL but was dropped due to time restrictions near the end of the WG.
>> Best, Dimitris
>> On Tue, May 19, 2020 at 11:05 PM Håvard Ottestad <hmottestad@gmail.com> wrote:
>>   Hi James and Irene,
>>   Thanks for the replies.
>>   This is more a question of the standardisation aspect. Did anyone
>>   discuss including more elaborate target building blocks? There is
>>   already sh:targetClass for rdf:type, but did anyone consider other
>>   class constructs like skos:inScheme?
>>   We already have two functional solutions within the current syntax:
>>    - use sh:targetNode with sh:inversePath
>>    - use SPARQL targets
>>   The issues with these solutions are:
>>   1. Using sh:targetNode with sh:inversePath is much harder to
>>   read than something like the compound target that we are
>>   considering introducing.
>>   2. SPARQL targets take too long to evaluate for transactional
>>   workloads.
>>   Håvard
>>>   On 19 May 2020, at 20:37, James Hudson <jameshudson3010@gmail.com> wrote:
>>>   You may want to check out:
>>>   https://stackoverflow.com/questions/61323857/what-is-the-difference-between-these-shape-graphs-which-use-shor
>>>   and
>>>   https://stackoverflow.com/questions/61190422/validating-that-every-subject-has-a-type-of-class
>>>   and other SHACL questions and answers I have on SO. They may help
>>>   you out.
>>>   As Irene already pointed out, SPARQL-based targets will solve
>>>   your problem.
>>>   On Tue, May 19, 2020 at 11:39 AM Håvard Ottestad <hmottestad@gmail.com> wrote:
>>>       Hi,
>>>       I’m the developer for the RDF4J SHACL implementation and we
>>>       are looking into extending the targeting options in SHACL and
>>>       are wondering if this is something that was discussed during
>>>       the development of the standard or if anyone else has run
>>>       into similar requirements.
>>>       Essentially extending the current list of sh:targetNode,
>>>       sh:targetClass, sh:targetSubjectsOf and sh:targetObjectsOf.
>>>       Our use case can be summed up as:
>>>       ex:Håvard ex:nationality ex:Norway;
>>>           ex:norwegianID "12345612345".
>>>       Where we would essentially like to be able to add a shape
>>>       that says that all Norwegian citizens should have a Norwegian
>>>       ID number.
>>>       We have been testing out the concept of a compound target.
>>>       For our current tests we have used our own namespace like this:
>>>       @prefix rdf4j-sh: <http://rdf4j.org/schema/rdf4j-shacl#> .
>>>       ex:PersonShape
>>>              a sh:NodeShape  ;
>>>              rdf4j-sh:compoundTarget [
>>>                      rdf4j-sh:targetPredicate ex:nationality;
>>>                      rdf4j-sh:targetObject ex:Norway
>>>              ];
>>>              sh:property [
>>>                     sh:path ex:norwegianID ;
>>>                     sh:minCount 1 ;
>>>                     sh:maxCount 1 ;
>>>              ] .
>>>       We have also been thinking about allowing
>>>       rdf4j-sh:targetObject to have multiple values.
>>>       I also realise that it’s possible to use sh:inversePath to solve
>>>       this same problem, but I feel it becomes hard to read and
>>>       grasp the intent.
>>>       ex:PersonShape
>>>              a sh:NodeShape  ;
>>>              sh:targetNode ex:Norway;
>>>              sh:property [
>>>                     sh:path [sh:inversePath ex:nationality ];
>>>                     sh:property [
>>>                       sh:path ex:norwegianID ;
>>>                       sh:minCount 1 ;
>>>                       sh:maxCount 1 ;
>>>                     ]
>>>              ] .
>>>       Concurrently we have been testing the SHACL Advanced SPARQL
>>>       targets. These allow us to do the same thing, but we are
>>>       unable to achieve the same level of performance. In one of
>>>       our benchmarks we see that SPARQL targets are 450x slower per
>>>       transaction than compound targets. This is mostly due to our
>>>       SHACL implementation being able to analyse the transactional
>>>       changes and run a very minimal validation for compound
>>>       targets. We do think that SPARQL targets could be
>>>       considerably faster, but the design choices that allow for
>>>       minimal transactional validation are currently also limiting
>>>       our options for speeding up SPARQL targets.
>>>       Does anyone know if this approach to more flexible
>>>       targeting has been considered as part of the spec? Or if
>>>       someone has run into similar needs and is maybe considering
>>>       implementing something similar?
>>>       Cheers,
>>>       Håvard
>> --
>> Kontokostas Dimitris
Received on Thursday, 21 May 2020 12:11:32 UTC
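
For comparison with the compound target quoted in the thread, here is a minimal sketch of how the same "Norwegian citizens need a Norwegian ID" requirement might be written with a SHACL Advanced Features SPARQL target. The ex: namespace IRI, the owl:Ontology prefix declaration and the shape name ex:NorwegianIdShape are assumptions made for illustration; the thread does not define them, and this is not necessarily the shape used in the benchmark mentioned above.

```turtle
@prefix sh:  <http://www.w3.org/ns/shacl#> .
@prefix owl: <http://www.w3.org/2002/07/owl#> .
@prefix xsd: <http://www.w3.org/2001/XMLSchema#> .
# Assumed namespace; the thread uses ex: without declaring it.
@prefix ex:  <http://example.com/ns#> .

# Prefix declaration so that "ex:" can be used inside the SPARQL string.
ex:
    a owl:Ontology ;
    sh:declare [
        sh:prefix "ex" ;
        sh:namespace "http://example.com/ns#"^^xsd:anyURI ;
    ] .

# Hypothetical shape name; targets every subject whose ex:nationality is ex:Norway.
ex:NorwegianIdShape
    a sh:NodeShape ;
    sh:target [
        a sh:SPARQLTarget ;
        sh:prefixes ex: ;
        sh:select """
            SELECT ?this
            WHERE {
                ?this ex:nationality ex:Norway .
            }
            """ ;
    ] ;
    sh:property [
        sh:path ex:norwegianID ;
        sh:minCount 1 ;
        sh:maxCount 1 ;
    ] .
```

Because the target is an opaque SPARQL query, a validator generally cannot tell which focus nodes a given transaction affects without re-running the query. This may be one reason such targets are harder to validate incrementally than a declarative predicate/object pair.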