Reopening the discussion on sh:targetShape from Holger Knublauch on 2020-07-06 (public-shacl@w3.org from July 2020)

From: Holger Knublauch <holger@topquadrant.com>
Date: Mon, 6 Jul 2020 11:01:47 +1000
To: Public Shacl W3C <public-shacl@w3.org>
Message-ID: <2d5454f1-74be-a0e3-46e4-bc50ad372cae@topquadrant.com>
There have been various discussions around SHACL target extensions, and 
there is an open Pull Request https://github.com/w3c/shacl/pull/3 to add 
sh:targetShape as a new target type to SHACL-AF. I have meanwhile 
attempted to implement that feature in our code base and have concluded 
that the feature is not a good idea for SHACL-AF (or even SHACL Core). 
The main argument is still about performance:

- I stated that the worst-case performance of this general feature is 
*catastrophic* as it needs to perform validation on all subjects and 
objects only to determine which nodes it then needs to validate for 
real. This means that sh:targetShape is very different from the other 4 
built-in target types (sh:targetClass, sh:targetNode, 
sh:targetSubjectsOf, sh:targetObjectsOf) in that it requires validation 
before validation (which by itself causes implementation complexity).

- Håvard stated that the alternative, SPARQL-based targets, performs 
badly in his implementation.

We do have similar use cases to yours, especially around dependencies 
across multiple properties. For example:

IF ex:country=USA THEN ex:state sh:in [ "AZ", "CA", "FL" ... ]
IF ex:country=AU THEN ex:state sh:in [ "NSW", "VIC", "QLD" ... ]

We also want a declarative solution that can be used by input forms, so 
that if the user changes the country then the states drop down list also 
changes. So relying on SPARQL queries or similar wouldn't solve our use 
cases either.

The current proposal, based on the new keyword sh:targetShape, was:

ex:USAStateShape
     sh:targetShape [
         a sh:PropertyShape ;
         sh:path ex:country ;
         sh:hasValue ex:USA ;
     ] ;
     sh:property [
         sh:path ex:state ;
         sh:in ( "AZ" "CA" "FL" ... )
     ] .

I believe the following is better overall:

ex:USAStateShape
     sh:target [
         a dash:HasValueTarget ;
         dash:predicate ex:country ;
         dash:object ex:USA ;
     ] ;
     sh:property [
         sh:path ex:state ;
         sh:in ( "AZ" "CA" "FL" ... )
     ] .

Where dash:HasValueTarget is a SPARQL-based Target Type 
https://w3c.github.io/shacl/shacl-af/#SPARQLTargetType
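
For illustration, such a target type could be declared roughly as follows. This is a minimal sketch using the SHACL-AF SPARQL-based target type mechanism, not necessarily the actual DASH definition: the parameter names dash:predicate and dash:object are taken from the example above, while the label template text and the sh:nodeKind constraint are assumptions.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
@prefix dash: <http://datashapes.org/dash#> .

# Sketch of a SPARQL-based target type (assumed shape of the declaration).
dash:HasValueTarget
    a sh:SPARQLTargetType ;
    rdfs:subClassOf sh:Target ;
    # Assumed wording for the human-readable label.
    sh:labelTemplate "All subjects where {$predicate} has value {$object}" ;
    sh:parameter [
        sh:path dash:predicate ;   # pre-bound as $predicate in the query
        sh:nodeKind sh:IRI ;       # assumption: predicates must be IRIs
    ] ;
    sh:parameter [
        sh:path dash:object ;      # pre-bound as $object in the query
    ] ;
    # Focus nodes are all subjects that have the given value.
    sh:select """
        SELECT DISTINCT ?this
        WHERE {
            ?this $predicate $object .
        }
        """ .
```

The query is a single triple pattern with both predicate and object bound, which is why an engine without SPARQL support could replace it with a direct index lookup.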

Implementations of SHACL-AF will already do the right thing and will be 
able to do so efficiently. If you cannot use SPARQL efficiently, your 
platform can simply hard-code this pattern, just like you currently 
hard-code the common scenarios of the proposed sh:targetShape property 
to avoid the bad default performance. I expect the difference in your 
implementation would be marginal, but we neither need to change the spec 
nor open up SHACL to a feature that is very complex to implement 
efficiently.

The downside of using something like dash:HasValueTarget is that it 
doesn't "cover" all possible use cases. Instead of allowing arbitrary 
sh:targetShapes we limit this to hasValue patterns. But those hasValue 
patterns were the main use cases before we brainstormed that "it would 
be nice" to also support various other shape types (sh:filterShape etc). 
hasValue patterns are trivial to look up. If anyone needs additional 
patterns, such as the one from the PR, then they can be covered by custom 
targets, which may also get hard-coded by those who cannot use SPARQL.

BTW the case of 
http://datashapes.org/constraints.html#HasValueInConstraintComponent can 
be covered by having multiple dash:HasValueTargets with different 
dash:objects. A bit more verbose but can reuse the same machinery. If 
you have long lists, introduce your own dash-like extension backed by a 
SPARQL query and hard-code against that if performance isn't good.
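
As a sketch of the multiple-target approach just described: a shape can carry several sh:target values, and its focus nodes are the union of what each target returns. ex:USA comes from the earlier example; ex:AU, the shape name, the sh:minCount constraint and the ex: namespace IRI are hypothetical and only serve the illustration.

```turtle
@prefix sh:   <http://www.w3.org/ns/shacl#> .
@prefix dash: <http://datashapes.org/dash#> .
@prefix ex:   <http://example.org/ns#> .   # hypothetical namespace

# Hypothetical shape: applies to nodes whose ex:country is ex:USA or ex:AU.
ex:StateRequiredShape
    a sh:NodeShape ;
    sh:target [
        a dash:HasValueTarget ;
        dash:predicate ex:country ;
        dash:object ex:USA ;
    ] ;
    sh:target [
        a dash:HasValueTarget ;
        dash:predicate ex:country ;
        dash:object ex:AU ;        # hypothetical IRI for Australia
    ] ;
    # Hypothetical constraint: such nodes must have at least one ex:state.
    sh:property [
        sh:path ex:state ;
        sh:minCount 1 ;
    ] .
```

Each additional value costs one more sh:target block, which is the verbosity mentioned above.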

Sorry for moving back and forth on this topic, but getting hands-on 
experience with an implementation revealed to me just how bad the 
sh:targetShape solution would become. And I couldn't schedule time for 
such an implementation earlier due to other commitments.

It would be useful to have input from other SHACL implementers (there 
are about a dozen SHACL engines out there, and counting). We really 
don't want to rush something through which then becomes a burden for others.

Holger
Received on Monday, 6 July 2020 01:02:07 UTC