Re: Reopening the discussion on sh:targetShape from Irene Polikoff on 2020-07-06 (public-shacl@w3.org from July 2020)

From: Irene Polikoff <irene@topquadrant.com>
Date: Mon, 6 Jul 2020 07:49:28 -0400
To: public-shacl@w3.org
Cc: Håvard Ottestad <hmottestad@gmail.com>, holger@topquadrant.com
Message-Id: <64718687-0028-4895-A979-B10EF1738FBA@topquadrant.com>
But if there is no agreement, then I am concerned about using the sh: namespace for this new construct. This does not seem right.

TQ has been adding some custom constructs to the dash: namespace, for example. Other namespaces for custom extensions could be used as well.

I believe the sh: namespace should be reserved for things that a majority of implementers have reached consensus on. I recognize that currently we only have 2 implementers in this discussion. It would be better to broaden the circle of people looking at this topic.

If we have a strong disagreement, then the process to follow is normally:

1. Have a call (or calls) between the concerned parties to see if they can reach an agreement to come to some compromise.
2. If not, and the objection is strong (as in “will not implement”), then typically the feature would not make it into the spec. Implementers can still do it as a custom extension.

In the past, we had some hypothetical and/or philosophical objections that slowed progress by many months. It was frustrating. An objection based on implementation experience is, however, different. I believe it needs serious consideration and a resolution, even if it slows progress. Unfortunately, there is no way to avoid some sluggishness when multiple parties need to come to consensus; this is a downside of standards development.

Irene

> On Jul 6, 2020, at 6:06 AM, Holger Knublauch <holger@topquadrant.com> wrote:
> 
>> On 6/07/2020 16:53, Håvard Ottestad wrote:
>> 
>> Hi Holger,
>> 
>> Could you share the shape that has particularly bad performance so I can see if I can think of an optimal solution?
> For examples 2,3,4 from https://github.com/w3c/shacl/pull/3
>> 
>> My plan has essentially been to convert the targetShape into a SPARQL query. This would put the performance in the same realm as SPARQL targets.
>> 
>> The benefit of targetShape over SPARQL targets is that it’s possible to validate changes to a database efficiently: we are seeing O(c) performance, where c is the effective size of the change, instead of the O(n) we were seeing with SPARQL targets (where n is the size of the database).
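
A minimal sketch of the idea behind the O(c) versus O(n) claim above, assuming an in-memory set of (subject, predicate, object) tuples and a simple hasValue-style target. The names `Triple`, `targets_full_scan` and `targets_from_change` are hypothetical and not taken from any SHACL engine, and only added triples are considered. It shows nothing more than that candidate focus nodes can be derived from the changed triples rather than from the whole database.

```python
from typing import Iterable, Set, Tuple

# Hypothetical in-memory representation: (subject, predicate, object).
Triple = Tuple[str, str, str]


def targets_full_scan(data: Iterable[Triple], predicate: str, obj: str) -> Set[str]:
    """O(n): inspect every triple in the database to find the target nodes."""
    return {s for (s, p, o) in data if p == predicate and o == obj}


def targets_from_change(
    data: Set[Triple], added: Iterable[Triple], predicate: str, obj: str
) -> Set[str]:
    """O(c): only subjects touched by the change can need (re)validation.

    Assumes `data` already contains the added triples and that set
    membership tests take constant time.
    """
    touched = {s for (s, _, _) in added}
    return {s for s in touched if (s, predicate, obj) in data}
```

With a change that touches a single subject, `targets_from_change` performs one membership test, however large `data` is.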
> 
> The same algorithms can be applied to the dash:HasValueTarget - it's just as declarative as the other shapes.
> 
> I think we (I at least) had discussed this sufficiently. I think I was clear on the performance issues. I think we can agree to disagree and move on. I was hoping that other implementers may have additional input.
> 
> Holger
> 
> 
>> 
>> Håvard
>> 
>>> On 6 Jul 2020, at 03:02, Holger Knublauch <holger@topquadrant.com> wrote:
>>> 
>>> There have been various discussions around SHACL target extensions, and there is an open Pull Request https://github.com/w3c/shacl/pull/3 to add sh:targetShape as a new target type to SHACL-AF. I have meanwhile attempted to implement that feature in our code base and have concluded that the feature is not a good idea for SHACL-AF (or even SHACL Core). The main argument is still about performance:
>>> 
>>> - I stated that the worst-case performance of this general feature is *catastrophic* as it needs to perform validation on all subjects and objects only to determine which nodes it then needs to validate for real. This means that sh:targetShape is very different from the other 4 built-in target types (sh:targetClass, sh:targetNode, sh:targetSubjectsOf, sh:targetObjectsOf) in that it requires validation before validation (which by itself causes implementation complexity).
>>> 
>>> - Håvard stated that the alternative, SPARQL-based targets, has bad performance in his implementation.
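
To make the "validation before validation" point concrete, here is a rough sketch under stated assumptions: triples are plain Python tuples, and `conforms` stands in for a full shape validation of one node. The function names are hypothetical and this is the naive strategy only, not how any particular engine implements sh:targetShape.

```python
from typing import Callable, Set, Tuple

# Hypothetical in-memory representation: (subject, predicate, object).
Triple = Tuple[str, str, str]


def all_nodes(data: Set[Triple]) -> Set[str]:
    """Every subject and object in the graph is a potential focus node."""
    return {s for (s, _, _) in data} | {o for (_, _, o) in data}


def target_shape_nodes(
    data: Set[Triple], conforms: Callable[[str, Set[Triple]], bool]
) -> Set[str]:
    """Naive sh:targetShape: validate every node once just to pick the targets."""
    return {n for n in all_nodes(data) if conforms(n, data)}


def target_subjects_of(data: Set[Triple], predicate: str) -> Set[str]:
    """sh:targetSubjectsOf-style target: a direct pattern match, no validation."""
    return {s for (s, p, _) in data if p == predicate}
```

In the naive version, `conforms` runs once per node in the graph before the real validation starts, whereas the built-in style target is a single pattern match.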
>>> 
>>> We do have similar use cases to yours, especially around dependencies across multiple properties. For example:
>>> 
>>> IF ex:country=USA THEN ex:state sh:in [ "AZ", "CA", "FL" ... ]
>>> IF ex:country=AU THEN ex:state sh:in [ "NSW", "VIC", "QLD" ... ]
>>> 
>>> We also want a declarative solution that can be used by input forms, so that if the user changes the country then the state drop-down list also changes. So relying on SPARQL queries or the like wouldn't solve our use cases either.
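
As an illustration of why a declarative rule helps here, the sketch below keeps the country-to-state rule as plain data and uses it twice: once to populate a form's drop-down and once to validate. It is a sketch under assumptions, not any product's implementation: the triple representation, the `ex:` identifiers as strings, and the names `ALLOWED_STATES`, `state_choices` and `violations` are all hypothetical, and the state lists are the abbreviated samples from the rules above.

```python
from typing import Dict, FrozenSet, List, Set, Tuple

# Hypothetical in-memory representation: (subject, predicate, object).
Triple = Tuple[str, str, str]

# Abbreviated sample lists copied from the email; the real lists are longer.
ALLOWED_STATES: Dict[str, FrozenSet[str]] = {
    "ex:USA": frozenset({"AZ", "CA", "FL"}),
    "ex:AU": frozenset({"NSW", "VIC", "QLD"}),
}


def values(data: Set[Triple], subject: str, predicate: str) -> Set[str]:
    """All objects of `predicate` for the given subject."""
    return {o for (s, p, o) in data if s == subject and p == predicate}


def state_choices(country: str) -> List[str]:
    """What a form could offer in the state drop-down for a given country."""
    return sorted(ALLOWED_STATES.get(country, frozenset()))


def violations(data: Set[Triple]) -> List[Tuple[str, str]]:
    """Return (focus node, offending state) pairs, using the same rule table."""
    found = []
    for country, allowed in ALLOWED_STATES.items():
        # hasValue-style target: nodes whose ex:country is this country.
        targets = {s for (s, p, o) in data if p == "ex:country" and o == country}
        for node in sorted(targets):
            for state in sorted(values(data, node, "ex:state")):
                if state not in allowed:
                    found.append((node, state))
    return found
```

Because the rule lives in data rather than in a query string, the form and the validator cannot drift apart, which is the property the paragraph above asks for.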
>>> 
>>> The current proposal, based on the new keyword sh:targetShape, was:
>>> 
>>> ex:USAStateShape
>>>     sh:targetShape [
>>>         a sh:PropertyShape ;
>>>         sh:path ex:country ;
>>>         sh:hasValue ex:USA ;
>>>     ] ;
>>>     sh:property [
>>>         sh:path ex:state ;
>>>         sh:in ( "AZ" "CA" "FL" ... )
>>>     ] .
>>> 
>>> I believe the following is better overall:
>>> 
>>> ex:USAStateShape
>>>     sh:target [
>>>         a dash:HasValueTarget ;
>>>         dash:predicate ex:country ;
>>>         dash:object ex:USA ;
>>>     ] ;
>>>     sh:property [
>>>         sh:path ex:state ;
>>>         sh:in ( "AZ" "CA" "FL" ... )
>>>     ] .
>>> 
>>> where dash:HasValueTarget is a SPARQL-based Target Type (https://w3c.github.io/shacl/shacl-af/#SPARQLTargetType).
>>> 
>>> Implementations of SHACL-AF already will do the right thing and will be able to do so efficiently. If you cannot use SPARQL efficiently, your platform can simply hard-code this pattern, just like you currently hard-code the common scenarios of the proposed sh:targetShape property to avoid the bad default performance. I expect the difference in your implementation would be marginal, but we neither need to change the spec nor open up SHACL to a feature that is very complex to implement efficiently.
>>> 
>>> The downside of using something like dash:HasValueTarget is that it doesn't "cover" all possible use cases. Instead of allowing arbitrary sh:targetShapes, we limit this to hasValue patterns. But those hasValue patterns were the main use cases before we brainstormed that "it would be nice" to also support various other shape types (sh:filterShape etc.). hasValue patterns are trivial to look up. If anyone needs additional patterns, such as the one from the PR, they can be covered by custom targets, which may also get hard-coded by those that cannot use SPARQL.
>>> 
>>> BTW the case of http://datashapes.org/constraints.html#HasValueInConstraintComponent can be covered by having multiple dash:HasValueTargets with different dash:objects. A bit more verbose but can reuse the same machinery. If you have long lists, introduce your own dash-like extension backed by a SPARQL query and hard-code against that if performance isn't good.
>>> 
>>> Sorry for moving back and forth on this topic, but getting hands-on experience with an implementation revealed to me just how bad the sh:targetShape solution would become. And I couldn't schedule time for such an implementation earlier due to other commitments.
>>> 
>>> It would be useful to have input from other SHACL implementers (there are about a dozen SHACL engines out there, and counting). We really don't want to rush something through which then becomes a burden for others.
>>> 
>>> Holger
>>> 
>>> 
>>> 
>
Received on Monday, 6 July 2020 11:49:44 UTC