- From: Holger Knublauch <holger@topquadrant.com>
- Date: Fri, 22 May 2020 15:04:54 +1000
- To: Håvard M. Ottestad <hmottestad@gmail.com>
- Cc: public-shacl@w3.org
- Message-ID: <fc1b0bd4-3b2b-7831-3f92-345092ce083b@topquadrant.com>
On 21/05/2020 20:02, "Håvard M. Ottestad" wrote:
> Hi Holger and everyone else :)
>
> The targets for the TargetShape would be all subjects and all objects
> in the data graph.

In earlier versions of SHACL we had something like sh:target
sh:AllSubjects and sh:target sh:AllObjects. I have forgotten the
details, but they were taken out soon afterwards and I moved them into
the dash: (datashapes.org) namespace, while sh:target became a SHACL-AF
term.

So I guess we could reintroduce something like those, and also
sh:filterShape, and would possibly have the most flexible solution? So

    target nodes = targets of sh:targetXY filtered by sh:filterShape

A clever algorithm then has a declarative model to work with and may
quickly detect patterns like

    ex:EveryoneWhoKnowsThreePeopleMustKnowSteve
        a sh:NodeShape ;
        sh:target sh:AllSubjects, sh:AllObjects ;
        sh:filterShape [
            sh:property [
                sh:path foaf:knows ;
                sh:minCount 3 ;
            ]
        ] ;
        sh:property [
            sh:path foaf:knows ;
            sh:hasValue ex:Steve ;
        ] .

Keeping sh:AllSubjects and sh:AllObjects separate would offer more
flexibility too. (The case above can only be satisfied by subjects, so
why even bother about the objects?)

Holger

> The TargetShape would produce all subjects or objects in the data
> graph that are considered valid according to the interpretation of
> the shape of the TargetShape.
>
> As a simple rule, a Shape with a clone of itself as the TargetShape
> would end up validating only targets that are known to be valid and
> would consequently return no violations.
>
> Håvard
>
>> On 21 May 2020, at 04:19, Holger Knublauch <holger@topquadrant.com>
>> wrote:
>>
>> On 20/05/2020 22:23, Håvard Ottestad wrote:
>>> Hi,
>>>
>>> For the RDF4J SHACL implementation we would be able to much better
>>> optimise for something like filters than we ever could for SPARQL
>>> targets. Currently our benchmarks show that our custom targeting
>>> approach is considerably faster than SPARQL targets: milliseconds
>>> vs. seconds.
>>> This wouldn’t necessarily apply to other implementations though.
>>>
>>> My idea about using filters as SHACL advanced targets would look
>>> something like this:
>>>
>>> Shape explanation: Anyone who knows three or more people must also
>>> know Steve.
>>>
>>>     ex:EveryoneWhoKnowsThreePeopleMustKnowSteve
>>>         a sh:Shape ;
>>>         sh:target [
>>>             a sh:TargetShape ;
>>>             sh:property [
>>>                 sh:path foaf:knows ;
>>>                 sh:minCount 3 ;
>>>             ]
>>>         ] ;
>>>         sh:property [
>>>             sh:path foaf:knows ;
>>>             sh:hasValue ex:Steve ;
>>>         ] .
>>>
>>> Which would essentially have the same results as:
>>>
>>>     ex:EveryoneWhoKnowsThreePeopleMustKnowSteve
>>>         a sh:Shape ;
>>>         sh:targetSubjectsOf foaf:knows ;
>>>         sh:or (
>>>             [ sh:path foaf:knows ; sh:minCount 3 ; sh:hasValue ex:Steve ; ]
>>>             [ sh:path foaf:knows ; sh:maxCount 2 ; ]
>>>         ) .
>>>
>>> Anyone think that this is a good (or maybe a particularly bad) idea?
>>
>> In general I agree that richer targets are needed. While there might
>> not be an official WG to produce such a thing, we as implementers
>> could establish a de-facto standard. I had designed sh:target to
>> serve as an extension point here, allowing custom systems to plug in
>> their own extensions. The use of a single property (sh:target) at
>> least indicates to a processor that *some* target exists, so that it
>> can at least print a warning if it doesn't know what to do with it.
>>
>> Down the road, if we agree on something as fundamental as something
>> similar to filterShapes, then we could introduce a new keyword such
>> as sh:targetNodesConforming which would take a shape declaration as
>> its value.
>>
>> My specific question (and I may be blind right now) is: what would be
>> the target nodes of the TargetShape in your example? Formally it
>> would need to be the set of all nodes in the universe, which doesn't
>> even exist. Without target nodes, most constraints cannot be
>> interpreted because they are formulated with a given focus node in
>> mind.
>>
>> That's why reopening sh:filterShape might be a better approach.
>> It has the advantage that filters can be added to any shape,
>> including shapes imported from a 3rd party, to narrow its targets
>> down for a specific application. I don't remember exactly why we
>> dropped that. Dimitris is correct that it was due to lack of time -
>> there was quite some panic at the end of the WG. The reason was
>> probably the complexity due to recursion. The minutes SHOULD have a
>> resolution which may explain more.
>>
>> Holger
>>
>>>
>>> Håvard
>>>
>>>> On 20 May 2020, at 12:48, Varytimou, Natasa (Refinitiv)
>>>> <Natasa.Varytimou@refinitiv.com> wrote:
>>>>
>>>> Hi all
>>>>
>>>> We also had a big performance issue with SHACL SPARQL targets,
>>>> which are incredibly useful.
>>>> Is there anything that can be done to improve performance?
>>>> And the same question for filters (which I agree would be useful
>>>> to include): will we have performance issues there as well?
>>>>
>>>> -----Original Message-----
>>>> From: Håvard Ottestad <hmottestad@gmail.com>
>>>> Sent: 20 May 2020 11:25
>>>> To: Andy Seaborne <andy@apache.org>
>>>> Cc: public-shacl@w3.org
>>>> Subject: Re: SHACL target extension
>>>>
>>>> Hi Andy and Dimitris
>>>>
>>>> Filters look like particularly useful constructs. They also look
>>>> very powerful, which is both good and bad.
>>>>
>>>> It’s quite close to what I want. I would want to have the filter
>>>> run on all nodes in the data graph, essentially a
>>>> sh:targetAllSubjects target. I think I saw something along those
>>>> lines already, but I couldn’t find it now while writing this email.
>>>>
>>>> I can see that a natural extension would be to allow filters to be
>>>> used as targets themselves, maybe through the SHACL Advanced
>>>> sh:target property.
>>>>
>>>> Håvard
>>>>
>>>>> On 20 May 2020, at 10:29, Andy Seaborne <andy@apache.org> wrote:
>>>>>
>>>>> Nice!
>>>>>
>>>>> That would be a useful addition to SHACL both on targets and on
>>>>> property shapes. And for rules.
>>>>>
>>>>> Were there any other features that got dropped that the community
>>>>> might be interested in?
>>>>>
>>>>> Andy
>>>>>
>>>>>>> On 19/05/2020 22:29, Dimitris Kontokostas wrote:
>>>>>> Hi Håvard,
>>>>>> I think what you are after is something like the filter shape
>>>>>> feature
>>>>>> (https://www.w3.org/TR/2016/WD-shacl-20160814/#filterShape).
>>>>>> This is something that existed in the first versions of
>>>>>> SHACL but was dropped due to time restrictions near the end of
>>>>>> the WG.
>>>>>> Best, Dimitris
>>>>>>> On Tue, May 19, 2020 at 11:05 PM Håvard Ottestad
>>>>>>> <hmottestad@gmail.com> wrote:
>>>>>> Hi James and Irene,
>>>>>> Thanks for the replies.
>>>>>> This is more a question of the standardisation aspect. Did anyone
>>>>>> discuss including more elaborate target building blocks?
>>>>>> There is already sh:targetClass for rdf:type, but did anyone
>>>>>> consider other class constructs like skos:inScheme?
>>>>>> We already have two functional solutions within the current syntax:
>>>>>> - use sh:targetNode with sh:inversePath
>>>>>> - use SPARQL targets
>>>>>> The issues with these solutions are:
>>>>>> 1. Using sh:targetNode and sh:inversePath is much harder to
>>>>>> read than something like the compound target that we are
>>>>>> considering introducing.
>>>>>> 2. SPARQL targets take too long to evaluate for transactional
>>>>>> workloads.
>>>>>> Håvard
>>>>>>> On 19 May 2020, at 20:37, James Hudson
>>>>>>> <jameshudson3010@gmail.com> wrote:
>>>>>>> You may want to check out:
>>>>>>> https://stackoverflow.com/questions/61323857/what-is-the-difference-between-these-shape-graphs-which-use-shor
>>>>>>> and
>>>>>>> https://stackoverflow.com/questions/61190422/validating-that-every-subject-has-a-type-of-class
>>>>>>> and other SHACL questions and answers I have on SO. They may help
>>>>>>> you out.
>>>>>>> As Irene already pointed out, SPARQL-based targets will solve
>>>>>>> your problem.
>>>>>>> On Tue, May 19, 2020 at 11:39 AM Håvard Ottestad
>>>>>>> <hmottestad@gmail.com> wrote:
>>>>>>> Hi,
>>>>>>> I’m the developer for the RDF4J SHACL implementation and we
>>>>>>> are looking into extending the targeting options in SHACL and
>>>>>>> are wondering if this is something that was discussed during
>>>>>>> the development of the standard or if anyone else has run
>>>>>>> into similar requirements.
>>>>>>> Essentially extending the current list of sh:targetNode,
>>>>>>> sh:targetClass, sh:targetSubjectsOf and sh:targetObjectsOf.
>>>>>>> Our use case can be summed up as:
>>>>>>>
>>>>>>>     ex:Håvard ex:nationality ex:Norway ;
>>>>>>>         ex:norwegianID "12345612345" .
>>>>>>>
>>>>>>> Where we would essentially like to be able to add a shape
>>>>>>> that says that all Norwegian citizens should have a Norwegian
>>>>>>> ID number.
>>>>>>> We have been testing out the concept of a compound target.
>>>>>>> For our current tests we have used our own namespace like this:
>>>>>>>
>>>>>>>     @prefix rdf4j-sh: <http://rdf4j.org/schema/rdf4j-shacl#> .
>>>>>>>
>>>>>>>     ex:PersonShape
>>>>>>>         a sh:NodeShape ;
>>>>>>>         rdf4j-sh:compoundTarget [
>>>>>>>             rdf4j-sh:targetPredicate ex:nationality ;
>>>>>>>             rdf4j-sh:targetObject ex:Norway
>>>>>>>         ] ;
>>>>>>>         sh:property [
>>>>>>>             sh:path ex:norwegianID ;
>>>>>>>             sh:minCount 1 ;
>>>>>>>             sh:maxCount 1 ;
>>>>>>>         ] .
>>>>>>>
>>>>>>> We have also been thinking about allowing
>>>>>>> rdf4j-sh:targetObject to have multiple values.
>>>>>>> I also realise that it’s possible to use inversePath to solve
>>>>>>> this same problem, but I feel it becomes hard to read and
>>>>>>> grasp the intent.
>>>>>>>
>>>>>>>     ex:PersonShape
>>>>>>>         a sh:NodeShape ;
>>>>>>>         sh:targetNode ex:Norway ;
>>>>>>>         sh:property [
>>>>>>>             sh:path [ sh:inversePath ex:nationality ] ;
>>>>>>>             sh:property [
>>>>>>>                 sh:path ex:norwegianID ;
>>>>>>>                 sh:minCount 1 ;
>>>>>>>                 sh:maxCount 1 ;
>>>>>>>             ]
>>>>>>>         ] .
>>>>>>>
>>>>>>> Concurrently we have been testing the SHACL Advanced SPARQL
>>>>>>> targets. These allow us to do the same thing, but we are
>>>>>>> unable to achieve the same level of performance. In one of
>>>>>>> our benchmarks we see that SPARQL targets are 450x slower per
>>>>>>> transaction than compound targets. This is mostly due to our
>>>>>>> SHACL implementation being able to analyse the transactional
>>>>>>> changes and run a very minimal validation for compound
>>>>>>> targets. We do think that SPARQL targets could be
>>>>>>> considerably faster, but the design choices that allow for
>>>>>>> minimal transactional validation are currently also limiting
>>>>>>> our options for speeding up SPARQL targets.
>>>>>>> Does anyone know if this approach to a more flexible
>>>>>>> targeting has been considered as part of the spec? Or if
>>>>>>> someone has run into similar needs and is maybe considering
>>>>>>> implementing something similar?
>>>>>>> Cheers,
>>>>>>> Håvard
>>>>>> --
>>>>>> Kontokostas Dimitris
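
The rule stated at the top of this message, "target nodes = targets of sh:targetXY filtered by sh:filterShape", can be sketched in plain Python. This is only an illustration under stated assumptions: the graph is a set of string triples rather than a real RDF store, and all function names (all_subjects, all_objects, target_nodes, values) are hypothetical and do not belong to any SHACL engine or library.

```python
from typing import Callable, Iterable, Set, Tuple

Triple = Tuple[str, str, str]
Graph = Set[Triple]


def all_subjects(graph: Graph) -> Set[str]:
    """Hypothetical stand-in for a target like sh:AllSubjects."""
    return {s for s, _, _ in graph}


def all_objects(graph: Graph) -> Set[str]:
    """Hypothetical stand-in for a target like sh:AllObjects."""
    return {o for _, _, o in graph}


def values(graph: Graph, node: str, path: str) -> Set[str]:
    """Value nodes reached from `node` via the predicate `path`."""
    return {o for s, p, o in graph if s == node and p == path}


def target_nodes(
    graph: Graph,
    targets: Iterable[Callable[[Graph], Set[str]]],
    filter_shape: Callable[[Graph, str], bool],
) -> Set[str]:
    """Union of all declared targets, narrowed by the filter."""
    candidates: Set[str] = set().union(*(t(graph) for t in targets))
    return {n for n in candidates if filter_shape(graph, n)}


# Toy data graph (assumed for illustration only).
graph: Graph = {
    ("ex:Alice", "foaf:knows", "ex:Bob"),
    ("ex:Alice", "foaf:knows", "ex:Carol"),
    ("ex:Alice", "foaf:knows", "ex:Steve"),
    ("ex:Bob", "foaf:knows", "ex:Alice"),
    ("ex:Dave", "foaf:knows", "ex:Alice"),
    ("ex:Dave", "foaf:knows", "ex:Bob"),
    ("ex:Dave", "foaf:knows", "ex:Carol"),
}


def knows_three(g: Graph, n: str) -> bool:
    # Filter: sh:path foaf:knows ; sh:minCount 3
    return len(values(g, n, "foaf:knows")) >= 3


def knows_steve(g: Graph, n: str) -> bool:
    # Constraint: sh:path foaf:knows ; sh:hasValue ex:Steve
    return "ex:Steve" in values(g, n, "foaf:knows")


targets = target_nodes(graph, [all_subjects, all_objects], knows_three)
violations = sorted(n for n in targets if not knows_steve(graph, n))
print(violations)
```

In this toy graph only ex:Alice and ex:Dave pass the filter, and ex:Dave is the single violation. Object-only nodes such as ex:Carol are collected by all_objects but can never pass the minCount filter, which matches the remark above that objects need not be considered for this particular shape.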
Received on Friday, 22 May 2020 05:05:14 UTC