Re: SHACL target extension from Holger Knublauch on 2020-06-11 (public-shacl@w3.org from June 2020)

From: Holger Knublauch <holger@topquadrant.com>
Date: Thu, 11 Jun 2020 17:16:43 +1000
To: Håvard Ottestad <hmottestad@gmail.com>
Cc: Public Shacl W3C <public-shacl@w3.org>, Vladimir Alexiev <vladimir.alexiev@ontotext.com>
Message-ID: <2bc6d3bf-ad73-4387-9097-ad55af7a8ab0@topquadrant.com>
On 11/06/2020 16:15, Håvard Ottestad wrote:

> Hi,
>
> A quick question Holger.
>
> You said "I would however introduce a new property instead of 
> sh:target, because the meaning of sh:target would otherwise be 
> overloaded and it is possible for targets to also be sh:NodeShapes in 
> which case the result will be very surprising. So, IMHO it should be 
> something like sh:targetShape (or the earlier, verbose 
> sh:targetNodesConforming)."
>
> Do you have any examples of where someone would already be using a 
> sh:NodeShape in sh:target?
>
> Would you reject this proposal based on that?

This is not for me to decide; SHACL is a group effort. Let's try to find 
a good compromise though :)

The case of custom targets that are also node shapes is unlikely in 
practice, albeit theoretically possible. But a stronger reason is that 
we would still overload the meaning of an already defined term. There is 
no reason to overload sh:target, other than that it would only require 
adding a paragraph under an existing section, and maybe that no other 
term needs to be introduced. However, sh:target is a SHACL-AF feature 
while I assume we want sh:targetShape to become a Core feature. This 
alone is a strong incentive for a new name, alongside the four existing 
Core target types. All IMHO of course.

Holger


> I can then think of three solutions:
>
> 1. sh:targetShape (your proposal)
> 2. a new subclass sh:TargetNodeShape rdfs:subClassOf sh:NodeShape, e.g. 
> sh:target [a sh:TargetNodeShape; ….]
> 3. a clean expansion on sh:target like how SPARQL targets work, e.g. 
> sh:target [a sh:ShapeTarget; sh:shape ex:nodeShape1]
>
> Håvard
>
>> On 5 Jun 2020, at 18:18, Vladimir Alexiev 
>> <vladimir.alexiev@ontotext.com> wrote:
>>
>> Hi Holger! Thanks for the comments!
>>
>>     introduce a new property instead of
>>     sh:target, because the meaning of sh:target would otherwise be
>>     overloaded and it is possible for targets to also be sh:NodeShapes 
>>
>>
>> SHACL-AF says "The algorithm that is used for this computation 
>> depends on the rdf:type of the custom target (sh:target)",
>> and then specifies two such types (sh:SPARQLTarget and 
>> sh:SPARQLTargetType).
>> My proposal is to use exactly sh:NodeShape as rdf:type, because we've 
>> described targeting by node shape.
>> I don't see why it's confusing to use the same sh:NodeShape for both 
>> targeting and its normal purpose (validation),
>> and it's important for us to be able to reuse shapes in this way (see 
>> the last 2 examples).
>>
>>     IMHO it should be something like sh:targetShape
>>
>>
>> I'd be fine with this (as long as we stick with type sh:NodeShape) 
>> but don't see why it's needed:
>> - my proposal: sh:target [a sh:NodeShape; ...]
>> - your proposal: sh:targetShape [a sh:NodeShape; ...]
>>
>> sh:target is polymorphic by SHACL-AF definition, so I don't see why 
>> we need a specialized prop name.
>>
>>     I remain very nervous about performance implications.
>>
>>
>> That was also my concern because we're paying Havard to implement 
>> what we need for the Onto platform,
>> which is a limited form of targeting (a conjunction of disjunctions of hasValue).
>> But Havard assures us that he's already implemented more generic 
>> targeting
>> (though still not full SHACL shapes! there's only atomic sh:path)
>> and that it's efficient.
>>
>> Havard has answered with a lot more detail about performance.
>>
>> I'll add some warning that such targeting is potentially expensive, 
>> and users must be careful when using it, and check with their 
>> specific SHACL implementation.
>>
>>     "is node N in the target of S" requires iterating over all
>>     sh:targetShapes each time. This can be very expensive.
>>
>>
>> Yes, that's also a concern and we'll give Havard sizable schemas (say 
>> 100 shapes, and each node matches say 5-10 shapes, being the depth of 
>> the "semantic type hierarchy").
>>
>>     The implementation cost of this feature is significant, because it
>>     requires the implementation of an "inverse validation" algorithm.
>>     Validation starts with a focus node and returns a result.
>>
>>
>> In rdf4j, validation starts with a transaction, assuming that 
>> data-at-rest is valid.
>> I believe Havard can "index" all the targeting shapes, so it's 
>> efficient to check all of them over the set of nodes in the transaction.
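
The indexing idea in the paragraph above could be sketched roughly as follows. This is a hypothetical illustration in plain Python, not rdf4j's actual implementation: each target is assumed to be a single hasValue-style (path, value) pair, triples are plain string tuples, and the names build_target_index and shapes_for_transaction are invented for this sketch.

```python
from collections import defaultdict

# Assumption: a triple is a (subject, predicate, object) tuple of strings,
# and each target shape is reduced to one (path, value) condition.

def build_target_index(targets):
    """Map each (path, value) condition to the set of shapes it selects."""
    index = defaultdict(set)
    for shape, (path, value) in targets.items():
        index[(path, value)].add(shape)
    return index

def shapes_for_transaction(index, added_triples):
    """Return {node: shapes} for nodes whose added triples hit a target condition."""
    hits = defaultdict(set)
    for s, p, o in added_triples:
        shapes = index.get((p, o))  # one dictionary look-up per triple
        if shapes:
            hits[s] |= shapes
    return dict(hits)

# Hypothetical shapes and data, for illustration only.
targets = {
    "ex:KiwiShape": ("ex:nationality", "ex:NewZealand"),
    "ex:PersonShape": ("rdf:type", "schema:Person"),
}
index = build_target_index(targets)
transaction = [
    ("ex:alice", "rdf:type", "schema:Person"),
    ("ex:alice", "ex:nationality", "ex:NewZealand"),
    ("ex:bob", "ex:nationality", "ex:Norway"),
]
result = shapes_for_transaction(index, transaction)
```

With this layout the cost per added triple is one look-up, independent of the number of target shapes. The sketch only finds nodes whose target-matching triple is part of the transaction; a node that already matched a target at rest and changes in some other way would still need a check against the stored data.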
>>
>>     guess most of them are hard to execute in the inverse order:
>>     sh:datatype, sh:nodeKind, sh:minExclusive etc, sh:minLength etc,
>>     sh:pattern, sh:languageIn, sh:uniqueLang, sh:lessThan etc, sh:closed,
>>
>>
>> You're right in many cases.
>> Any user who selects nodes by strlen is shooting himself in the foot.
>> So we'd better put in some warnings about which constructs it's wise to use 
>> in a target shape, and which ones are stupid.
>>
>>     So what if we simply introduce a new target type sh:targetHasValue V
>>     where the targets can be identified by a direct look-up. For example
>>
>>     ex:KiwiShape
>>          sh:targetHasValue [
>>              sh:path ex:nationality ;
>>              sh:hasValue ex:NewZealand ;
>>
>>
>> We need somewhat more though:
>>
>> ex:PoliticianShape a sh:NodeShape;
>>   sh:semanticTarget (
>>     [sh:path rdf:type; valueIn (dbo:Person schema:Person)]
>>     [sh:path dt:type; valueIn ("politician" "president")]
>>   );
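
One possible reading of the sh:semanticTarget example above, sketched in plain Python under stated assumptions: the list is treated as a conjunction of clauses, and each valueIn as a disjunction over the values of one atomic path. The function name in_semantic_target and the sample data are hypothetical and not part of any SHACL specification or implementation.

```python
def in_semantic_target(node, triples, target):
    """True if, for every (path, allowed_values) clause, node has an allowed value."""
    values = {}
    for s, p, o in triples:
        if s == node:
            values.setdefault(p, set()).add(o)
    # Conjunction over clauses; each clause is satisfied by any one allowed value.
    return all(values.get(path, set()) & allowed for path, allowed in target)

# Mirrors the ex:PoliticianShape target from the message.
politician_target = [
    ("rdf:type", {"dbo:Person", "schema:Person"}),
    ("dt:type", {"politician", "president"}),
]

# Hypothetical data, for illustration only.
data = [
    ("ex:a", "rdf:type", "schema:Person"),
    ("ex:a", "dt:type", "president"),
    ("ex:b", "rdf:type", "schema:Person"),
    ("ex:b", "dt:type", "actor"),
]
```

Here ex:a satisfies both clauses, while ex:b fails the dt:type clause. Because every clause is a set of exact values on a single path, each one can in principle be answered by a direct look-up rather than by running full shape validation in reverse.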
>>
>> That's what I started with, but then you guys said "filter shapes are 
>> very useful", so I wrote up the more general case.
>>
>
Received on Thursday, 11 June 2020 07:17:01 UTC