- From: Holger Knublauch <holger@topquadrant.com>
- Date: Fri, 22 May 2020 15:04:54 +1000
- To: Håvard M. Ottestad <hmottestad@gmail.com>
- Cc: public-shacl@w3.org
- Message-ID: <fc1b0bd4-3b2b-7831-3f92-345092ce083b@topquadrant.com>
On 21/05/2020 20:02, "Håvard M. Ottestad" wrote:
> Hi Holger and everyone else :)
>
> The targets for the TargetShape would be all subjects and all objects
> in the data graph.
In earlier versions of SHACL we had something like sh:target
sh:AllSubjects and sh:target sh:AllObjects. I have forgotten the details
but they were taken out soon afterwards and I moved them into the dash:
(datashapes.org) namespace while sh:target became a SHACL-AF term.
So I guess we could reintroduce something like those and also
sh:filterShape and would possibly have the most flexible solution? So
target nodes = targets of sh:targetXY filtered by sh:filterShape
A clever algorithm then has a declarative model to work with and may
quickly detect patterns like
ex:EveryoneWhoKnowsThreePeopleMustKnowSteve
    a sh:NodeShape ;
    sh:target sh:AllSubjects, sh:AllObjects ;
    sh:filterShape [
        sh:property [
            sh:path foaf:knows ;
            sh:minCount 3 ;
        ]
    ] ;
    sh:property [
        sh:path foaf:knows ;
        sh:hasValue ex:Steve ;
    ] .
Keeping sh:AllSubjects and sh:AllObjects separate would offer more
flexibility too. (The case above can only be satisfied by subjects, so
why even bother about the objects?)
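To make the "targets filtered by sh:filterShape" reading concrete, here is a minimal sketch in plain Python. It is not taken from any SHACL implementation: the tuple-based graph, the helper names (all_subjects, all_objects, passes_filter, target_nodes, violations) and the sample data are all assumptions made up for illustration, and sh:AllSubjects / sh:AllObjects / sh:filterShape are the proposed terms from above, not part of the current spec.

```python
# Illustrative sketch only: a graph is a set of (subject, predicate, object)
# tuples, and every function name below is hypothetical.
KNOWS = "foaf:knows"
STEVE = "ex:Steve"


def all_subjects(graph):
    """Candidate targets for the proposed sh:AllSubjects."""
    return {s for s, _, _ in graph}


def all_objects(graph):
    """Candidate targets for the proposed sh:AllObjects."""
    return {o for _, _, o in graph}


def values(graph, node, path):
    """Value nodes reachable from `node` via the predicate `path`."""
    return {o for s, p, o in graph if s == node and p == path}


def passes_filter(graph, node):
    """Filter shape: sh:path foaf:knows ; sh:minCount 3."""
    return len(values(graph, node, KNOWS)) >= 3


def target_nodes(graph):
    """Target nodes = (all subjects + all objects) filtered by the filter shape."""
    candidates = all_subjects(graph) | all_objects(graph)
    return {n for n in candidates if passes_filter(graph, n)}


def violations(graph):
    """Targets that fail sh:path foaf:knows ; sh:hasValue ex:Steve."""
    return sorted(n for n in target_nodes(graph)
                  if STEVE not in values(graph, n, KNOWS))


data = {
    ("ex:Alice", KNOWS, "ex:Bob"),
    ("ex:Alice", KNOWS, "ex:Carol"),
    ("ex:Alice", KNOWS, "ex:Dave"),
    ("ex:Erin", KNOWS, "ex:Bob"),
    ("ex:Erin", KNOWS, "ex:Carol"),
    ("ex:Erin", KNOWS, STEVE),
    ("ex:Bob", KNOWS, STEVE),
}

print(violations(data))
```

In this sample only ex:Alice and ex:Erin survive the filter, and only ex:Alice is reported. Nodes that occur only as objects never pass the minCount filter, which is the point about not needing sh:AllObjects for this particular shape.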
Holger
>
> The TargetShape would produce all subjects or objects in the data
> graph that are considered valid according to the interpretation of the
> shape of the TargetShape.
>
> As a simple rule, a Shape with a clone of itself as the TargetShape
> would end up validating only targets that are known to be valid and
> would consequently return no violations.
>
> Håvard
>
>> On 21 May 2020, at 04:19, Holger Knublauch <holger@topquadrant.com>
>> wrote:
>>
>>
>>
>>
>> On 20/05/2020 22:23, Håvard Ottestad wrote:
>>> Hi,
>>>
>>> For the RDF4J SHACL implementation we would be able to much better
>>> optimise for something like filters than we ever could for SPARQL
>>> targets. Currently our benchmarks show that our custom targeting
>>> approach is considerably faster than SPARQL targets, milliseconds
>>> vs. seconds. This wouldn’t necessarily apply to other
>>> implementations though.
>>>
>>> My idea about using filters as SHACL advanced targets would look
>>> something like this:
>>>
>>> Shape explanation: Anyone who knows three or more people must also
>>> know Steve.
>>>
>>> ex:EveryoneWhoKnowsThreePeopleMustKnowSteve
>>>     a sh:Shape ;
>>>     sh:target [
>>>         a sh:TargetShape ;
>>>         sh:property [
>>>             sh:path foaf:knows ;
>>>             sh:minCount 3 ;
>>>         ]
>>>     ] ;
>>>     sh:property [
>>>         sh:path foaf:knows ;
>>>         sh:hasValue ex:Steve ;
>>>     ] .
>>>
>>> Which would essentially have the same results as:
>>>
>>> ex:EveryoneWhoKnowsThreePeopleMustKnowSteve
>>>     a sh:Shape ;
>>>     sh:targetSubjectsOf foaf:knows ;
>>>     sh:or (
>>>         [
>>>             sh:path foaf:knows ;
>>>             sh:minCount 3 ;
>>>             sh:hasValue ex:Steve ;
>>>         ]
>>>         [
>>>             sh:path foaf:knows ;
>>>             sh:maxCount 2 ;
>>>         ]
>>>     ) .
>>>
>>> Anyone think that this is a good (or maybe a particularly bad) idea?
>>
>> In general I agree that richer targets are needed. While there might
>> not be an official WG to produce such a thing, we as implementers
>> could establish a de-facto standard. I had designed sh:target to
>> serve as an extension point here, allowing custom systems to plug in
>> their own extensions. The use of a single property (sh:target) at least
>> indicates to a processor that *some* target exists, so that it can at
>> least print a warning if it doesn't know what to do with it.
>>
>> Down the road, if we agree on something as fundamental as a construct
>> similar to filterShapes then we could introduce a new keyword such as
>> sh:targetNodesConforming which would take a shape declaration as its
>> value.
>>
>> My specific question (and I may be blind right now) is: what would be
>> the target nodes of the TargetShape in your example? Formally it
>> would need to be the set of all nodes in the universe, which doesn't
>> even exist. Without target nodes, most constraints cannot be
>> interpreted because they are formulated with a given focus node in mind.
>>
>> That's why reopening sh:filterShape might be a better approach. It
>> has the advantage that filters can be added to any shape including
>> shapes imported from a 3rd party, to narrow its targets down for a
>> specific application. I don't remember exactly why we dropped that.
>> Dimitris is correct that it was due to lack of time - there was quite
>> some panic at the end of the WG. The reason was probably the
>> complexity due to recursion. The minutes SHOULD have a resolution
>> which may explain more.
>>
>> Holger
>>
>>
>>>
>>> Håvard
>>>
>>>
>>>
>>>> On 20 May 2020, at 12:48, Varytimou, Natasa (Refinitiv)
>>>> <Natasa.Varytimou@refinitiv.com> wrote:
>>>>
>>>> Hi all
>>>>
>>>> We also had a big performance issue with SHACL SPARQL targets, which
>>>> are incredibly useful.
>>>> Is there anything that can be done to improve performance?
>>>> And the same question for filters (which I agree would be useful
>>>> to include): will we have performance issues there as well?
>>>>
>>>>
>>>> -----Original Message-----
>>>> From: Håvard Ottestad <hmottestad@gmail.com>
>>>> Sent: 20 May 2020 11:25
>>>> To: Andy Seaborne <andy@apache.org>
>>>> Cc: public-shacl@w3.org
>>>> Subject: Re: SHACL target extension
>>>>
>>>> Hi Andy and Dimitris
>>>>
>>>> Filters look like particularly useful constructs. They also look
>>>> very powerful, which is both good and bad.
>>>>
>>>> It’s quite close to what I want. I would want to have the filter
>>>> run on all nodes in the data graph, essentially a
>>>> sh:targetAllSubjects target. I think I saw something along those
>>>> lines already, but I couldn’t find it now while writing this email.
>>>>
>>>> I can see that a natural extension would be to allow filters to be
>>>> used as targets themselves, maybe through the SHACL Advanced
>>>> sh:target property.
>>>>
>>>> Håvard
>>>>
>>>>> On 20 May 2020, at 10:29, Andy Seaborne <andy@apache.org> wrote:
>>>>>
>>>>> Nice!
>>>>>
>>>>> That would be a useful addition to SHACL both on targets and on
>>>>> property shapes. And for rules.
>>>>>
>>>>> Were there any other features that got dropped that the community
>>>>> might be interested in?
>>>>>
>>>>> Andy
>>>>>
>>>>>
>>>>>>> On 19/05/2020 22:29, Dimitris Kontokostas wrote:
>>>>>> Hi Håvard,
>>>>>> I think what you are after is something like the filter shape
>>>>>> feature
>>>>>> (https://www.w3.org/TR/2016/WD-shacl-20160814/#filterShape).
>>>>>> This is something that existed in the first versions of
>>>>>> SHACL but was dropped due to time restrictions near the end of
>>>>>> the WG.
>>>>>> Best, Dimitris
>>>>>>> On Tue, May 19, 2020 at 11:05 PM Håvard Ottestad
>>>>>>> <hmottestad@gmail.com> wrote:
>>>>>> Hi James and Irene,
>>>>>> Thanks for the replies.
>>>>>> This is more a question of the standardisation aspect. Did anyone
>>>>>> discuss including more elaborate target building blocks? There is
>>>>>> already sh:targetClass for rdf:type, but did anyone consider other
>>>>>> class constructs like skos:inScheme?
>>>>>> We already have two functional solutions within the current syntax:
>>>>>> - use sh:targetNode with sh:inversePath
>>>>>> - use SPARQL targets
>>>>>> The issues with these solutions are:
>>>>>> 1. Using sh:targetNode and sh:inversePath is much harder to
>>>>>> read than something like the compound target that we are
>>>>>> considering introducing.
>>>>>> 2. SPARQL targets take too long to evaluate for transactional
>>>>>> workloads.
>>>>>> Håvard
>>>>>>> On 19 May 2020, at 20:37, James Hudson
>>>>>>> <jameshudson3010@gmail.com> wrote:
>>>>>>> You may want to check out:
>>>>>>> https://stackoverflow.com/questions/61323857/what-is-the-difference-between-these-shape-graphs-which-use-shor
>>>>>>> and
>>>>>>> https://stackoverflow.com/questions/61190422/validating-that-every-subject-has-a-type-of-class
>>>>>>> and other SHACL questions and answers I have on SO. They may help
>>>>>>> you out.
>>>>>>> As Irene already pointed out, SPARQL-based targets will solve
>>>>>>> your problem.
>>>>>>> On Tue, May 19, 2020 at 11:39 AM Håvard Ottestad
>>>>>>> <hmottestad@gmail.com> wrote:
>>>>>>> Hi,
>>>>>>> I’m the developer for the RDF4J SHACL implementation and we
>>>>>>> are looking into extending the targeting options in SHACL and
>>>>>>> are wondering if this is something that was discussed during
>>>>>>> the development of the standard or if anyone else has run
>>>>>>> into similar requirements.
>>>>>>> Essentially extending the current list of sh:targetNode,
>>>>>>> sh:targetClass, sh:targetSubjectsOf and sh:targetObjectsOf.
>>>>>>> Our use case can be summed up as:
>>>>>>> ex:Håvard ex:nationality ex:Norway ;
>>>>>>>     ex:norwegianID "12345612345" .
>>>>>>> Where we would essentially like to be able to add a shape
>>>>>>> that says that all Norwegian citizens should have a Norwegian
>>>>>>> ID number.
>>>>>>> We have been testing out the concept of a compound target.
>>>>>>> For our current tests we have used our own namespace like this:
>>>>>>> @prefix rdf4j-sh: <http://rdf4j.org/schema/rdf4j-shacl#> .
>>>>>>> ex:PersonShape
>>>>>>>     a sh:NodeShape ;
>>>>>>>     rdf4j-sh:compoundTarget [
>>>>>>>         rdf4j-sh:targetPredicate ex:nationality ;
>>>>>>>         rdf4j-sh:targetObject ex:Norway
>>>>>>>     ] ;
>>>>>>>     sh:property [
>>>>>>>         sh:path ex:norwegianID ;
>>>>>>>         sh:minCount 1 ;
>>>>>>>         sh:maxCount 1 ;
>>>>>>>     ] .
>>>>>>> We have also been thinking about allowing
>>>>>>> rdf4j-sh:targetObject to have multiple values.
>>>>>>> I also realise that it’s possible to use inversePath to solve
>>>>>>> this same problem, but I feel it becomes hard to read and
>>>>>>> grasp the intent.
>>>>>>> ex:PersonShape
>>>>>>>     a sh:NodeShape ;
>>>>>>>     sh:targetNode ex:Norway ;
>>>>>>>     sh:property [
>>>>>>>         sh:path [ sh:inversePath ex:nationality ] ;
>>>>>>>         sh:property [
>>>>>>>             sh:path ex:norwegianID ;
>>>>>>>             sh:minCount 1 ;
>>>>>>>             sh:maxCount 1 ;
>>>>>>>         ]
>>>>>>>     ] .
>>>>>>> Concurrently we have been testing the SHACL Advanced SPARQL
>>>>>>> targets. These allow us to do the same thing, but we are
>>>>>>> unable to achieve the same level of performance. In one of
>>>>>>> our benchmarks we see that SPARQL targets are 450x slower per
>>>>>>> transaction than compound targets. This is mostly due to our
>>>>>>> SHACL implementation being able to analyse the transactional
>>>>>>> changes and run a very minimal validation for compound
>>>>>>> targets. We do think that SPARQL targets could be
>>>>>>> considerably faster, but the design choices that allow for
>>>>>>> minimal transactional validation are currently also limiting
>>>>>>> our options for speeding up SPARQL targets.
>>>>>>> Does anyone know if this approach to a more flexible
>>>>>>> targeting has been considered as part of the spec? Or if
>>>>>>> someone has run into similar needs and is maybe considering
>>>>>>> implementing something similar.
>>>>>>> Cheers,
>>>>>>> Håvard
>>>>>> --
>>>>>> Kontokostas Dimitris
>>>>
>>>>
>>>
Received on Friday, 22 May 2020 05:05:14 UTC