Re: SHACL target extension from Vladimir Alexiev on 2020-06-11 (public-shacl@w3.org from June 2020)

From: Vladimir Alexiev <vladimir.alexiev@ontotext.com>
Date: Thu, 11 Jun 2020 11:04:54 +0300
To: Holger Knublauch <holger@topquadrant.com>
Cc: Håvard Ottestad <hmottestad@gmail.com>, Public Shacl W3C <public-shacl@w3.org>
Message-ID: <CAMv+wg4g86LToe5_qtV61Jq=vPHDGo=Uq0m-yXt-MEEzYfErow@mail.gmail.com>

I'll change my proposal to use sh:targetShape and submit a PR against
shacl-af (because Core can't be changed).

Should sh:targetShape be a subprop of sh:target?

PS: we do intend to reuse the same "reference" shapes for both targeting
and validating the existence ("semantic type") of nodes.
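
A rough sketch of that reuse, in Turtle. This is an illustration only:
sh:targetShape is the proposed property and is not defined in any SHACL
spec, the ex: names are made up, and the rdfs:subPropertyOf triple is just
the open question above written down, not a decision.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix rdf:  <http://www.w3.org/1999/02/22-rdf-syntax-ns#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dbo:  <http://dbpedia.org/ontology/> .
@prefix ex:   <http://example.org/> .

# Open question from above, written as a triple (not decided).
sh:targetShape rdfs:subPropertyOf sh:target .

# Hypothetical "reference" shape describing a semantic type.
ex:PersonReference a sh:NodeShape ;
  sh:property [ sh:path rdf:type ; sh:hasValue dbo:Person ] .

# Reuse 1, targeting: validate every node that conforms to the reference shape.
ex:PersonShape a sh:NodeShape ;
  sh:targetShape ex:PersonReference ;
  sh:property [ sh:path ex:name ; sh:minCount 1 ] .

# Reuse 2, validation: values of ex:author must be of that semantic type.
ex:BookShape a sh:NodeShape ;
  sh:targetClass ex:Book ;
  sh:property [ sh:path ex:author ; sh:node ex:PersonReference ] .
```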

On Thu, Jun 11, 2020, 10:16 Holger Knublauch <holger@topquadrant.com> wrote:

> On 11/06/2020 16:15, Håvard Ottestad wrote:
>
> Hi,
>
> A quick question Holger.
>
> You said "I would however introduce a new property instead of sh:target,
> because the meaning of sh:target would otherwise be overloaded and it is
> possible for targets to also be sh:NodeShapes in which case the result will
> be very surprising. So, IMHO it should be something like sh:targetShape (or
> the earlier, verbose sh:targetNodesConforming)."
>
> Do you have any examples of where someone would already be using a
> sh:NodeShape in sh:target?
>
> Would you reject this proposal based on that?
>
> This is not for me to decide, SHACL is a group effort. Let's try to find a
> good compromise though :)
>
> The case of custom targets that are also node shapes is unlikely in
> practice, albeit theoretically possible. But a stronger reason is that we
> would still overload the meaning of an already defined term. There is no
> reason to overload sh:target, other than that it would only require adding
> a paragraph under an existing section, and maybe that no other term needs
> to be introduced. However, sh:target is a SHACL-AF feature while I assume
> we want sh:targetShape to become a Core feature. This alone is a strong
> incentive for a new name, alongside the four existing Core target types.
> All IMHO of course.
>
> Holger
>
>
> I can then think of three solutions:
>
> 1. sh:targetShape (your proposal)
> 2. a new subclass sh:TargetNodeShape rdfs:subClassOf sh:NodeShape. E.g.
> sh:target [a sh:TargetNodeShape; ….]
> 3. a clean expansion on sh:target like how SPARQL targets work. E.g.
> sh:target [a sh:ShapeTarget; sh:shape ex:nodeShape1]
>
> Håvard
>
> On 5 Jun 2020, at 18:18, Vladimir Alexiev <vladimir.alexiev@ontotext.com>
> wrote:
>
> Hi Holger! Thanks for the comments!
>
> introduce a new property instead of
>> sh:target, because the meaning of sh:target would otherwise be
>> overloaded and it is possible for targets to also be sh:NodeShapes
>
>
> SHACL-AF says "The algorithm that is used for this computation depends on
> the rdf:type of the custom target (sh:target)",
> and then specifies two such types (sh:SPARQLTarget and
> sh:SPARQLTargetType).
> My proposal is to use exactly sh:NodeShape as rdf:type, because we've
> described targeting by node shape.
> I don't see why it's confusing to use the same sh:NodeShape for both
> targeting and its normal purpose (validation),
> and it's important for us to be able to reuse shapes in this way (see the
> last 2 examples).
>
> IMHO it should be something like sh:targetShape
>>
>
> I'd be fine with this (as long as we stick with type sh:NodeShape) but
> don't see why it's needed:
> - my proposal: sh:target [a sh:NodeShape; ...]
> - your proposal: sh:targetShape [a sh:NodeShape; ...]
>
> sh:target is polymorphic by SHACL-AF definition, so I don't see why we
> need a specialized prop name.
>
> I remain very nervous about performance implications.
>
>
> That was also my concern because we're paying Havard to implement what we
> need for the Onto platform,
> which is a limited form of targeting (a conjunction of disjunctions of hasValue).
> But Havard assures us that he's already implemented more generic targeting
> (though still not full SHACL shapes! there's only atomic sh:path)
> and that it's efficient.
>
> Havard has answered with a lot more detail about performance.
>
> I'll add some warning that such targeting is potentially expensive, and
> users must be careful when using it, and check with their specific SHACL
> implementation.
>
>
>> "is node N in the target of S" requires iterating over all
>> sh:targetShapes each time. This can be very expensive.
>>
>
> Yes, that's also a concern and we'll give Havard sizable schemas (say 100
> shapes, and each node matches say 5-10 shapes, being the depth of the
> "semantic type hierarchy").
>
> The implementation cost of this feature is significant, because it
>> requires the implementation of an "inverse validation" algorithm.
>> Validation starts with a focus node and returns a result.
>
>
> In rdf4j, validation starts with a transaction, assuming that data-at-rest
> is valid.
> I believe Havard can "index" all the targeting shapes, so it's efficient
> to check all of them over the set of nodes in the transaction.
>
> guess most of them are hard to execute in the inverse order:
>> sh:datatype, sh:nodeKind, sh:minExclusive etc, sh:minLength etc,
>> sh:pattern, sh:languageIn, sh:uniqueLang, sh:lessThan etc, sh:closed,
>
>
> You're right in many cases.
> Any user who selects nodes by strlen is shooting himself in the foot.
> So we'd better put in some warnings about which constructs are wise to use
> in a target shape, and which ones are stupid.
>
>
>> So what if we simply introduce a new target type sh:targetHasValue V
>> where the targets can be identified by a direct look-up. For example
>>
>> ex:KiwiShape
>>      sh:targetHasValue [
>>          sh:path ex:nationality ;
>>          sh:hasValue ex:NewZealand ;
>>
>
> We need somewhat more though:
>
> ex:PoliticianShape a sh:NodeShape;
>   sh:semanticTarget (
>     [sh:path rdf:type; valueIn (dbo:Person schema:Person)]
>     [sh:path dt:type; valueIn ("politician" "president")]
>   );
>
> That's what I started with, but then you guys said "filter shapes are very
> useful", so I wrote up the more general case.
>
>
>
Received on Thursday, 11 June 2020 08:05:19 UTC