Re: ISSUE-92: Should repeated properties be interpreted as additive or conjunctive?

I looked at how partition is defined in the abstract syntax. The problem is that the abstract syntax doesn’t reflect SHACL’s current design and semantics. I am not sure it was ever 100% accurate, but right now it definitely is not. Because of this, it was not obvious to me how one would take what is in the abstract syntax document and put it into SHACL CORE.

Thus, unfortunately, I believe my original statement still holds: this is a piece of work that someone still needs to do. Unless someone volunteers to do it, we should remove sh:partition.

Where the use cases that motivated sh:partition cannot be practically supported through QCRs and logical constraint components, they can be supported using SHACL SPARQL. The community can even define libraries of constraints for specific vocabularies such as FOAF that would say, for example, that foaf:firstName and foaf:givenName should not be used together.

Needing to say “use either one property or another and not both” or “use one property or a combination of two other properties” are valid use cases for some users. However, there are also many valid use cases that require specifying other types of dependencies between values of properties. For example:
- One may want to say that if there is a value for ex:approvalDate, there must be a value for ex:approvedBy, and vice versa. In other words, they must always be used together or not used at all. In my experience, this type of constraint is requested more often than “use either one property or another and not both”.
- Or, in a slightly more complex scenario, if the value of ex:status is “Approved”, then there must be both a value for ex:approvalDate and a value for ex:approvedBy.
Neither of these could be expressed with sh:partition. In the earlier iterations of SHACL they were supported in CORE using filters. There also was xOr that could have been used to support saying “use either one property or another and not both”. As SHACL was simplified, both of these features were dropped and, as a result, such constraints are also no longer expressible in SHACL CORE. They could still be done in SHACL SPARQL.
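
To make the dependency examples above concrete, here is a minimal Python sketch. It is not SHACL and not SHACL SPARQL. It assumes a resource is modelled as a dictionary from property names to lists of values, and the function names are invented purely for illustration.

```python
# Illustrative only: a resource is modelled as a dict from property name
# to a list of values. This is not SHACL syntax or any SHACL engine's API.

def used_together(resource, prop_a, prop_b):
    """True if both properties have values, or neither does."""
    has_a = bool(resource.get(prop_a))
    has_b = bool(resource.get(prop_b))
    return has_a == has_b


def approved_has_details(resource):
    """If ex:status is "Approved", both approval properties must have values."""
    if "Approved" not in resource.get("ex:status", []):
        return True  # the condition does not apply to this resource
    return bool(resource.get("ex:approvalDate")) and bool(resource.get("ex:approvedBy"))


def exactly_one_of(resource, prop_a, prop_b):
    """The "use either one property or another and not both" case."""
    return bool(resource.get(prop_a)) != bool(resource.get(prop_b))
```

For instance, a resource with only ex:approvalDate would fail used_together, while a resource with ex:status "Draft" and no approval properties would pass approved_has_details.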

I wish we had kept filters, for example, but we need to produce a “minimal viable product” on schedule with SHACL 1.0. To do so, the WG has been paring down and simplifying the language, and W3C management recommended that we continue to do so while exercising our best judgement.

Another option, if the WG felt that we must further improve support for the partition-type use cases in CORE, would be to add the flag as proposed by Holger. It seems to be fairly straightforward, low effort and standalone. If we get too many issues with it at CR-1, it could be easily dropped at CR-2.
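
As I read the proposal, with the flag set each qualified value shape would count only those values that conform to it and to none of its sibling qualified value shapes on the same property. A rough Python sketch of that counting follows. It assumes shapes can be modelled as plain predicates and that each member carries a set of gender values; all names are invented for illustration and none of this is a SHACL API.

```python
# Illustrative only: shapes are modelled as Python predicates over values.
# The actual proposal concerns sh:qualifiedValueShape in SHACL.

def disjoint_counts(values, shapes):
    """For each named shape, count values that conform to it and to no sibling."""
    counts = {}
    for name, conforms in shapes.items():
        siblings = [s for other, s in shapes.items() if other != name]
        counts[name] = sum(
            1 for v in values
            if conforms(v) and not any(s(v) for s in siblings)
        )
    return counts


def marriage_ok(members):
    """Exactly one member counted as male and exactly one counted as female."""
    shapes = {
        # m["gender"] is assumed to be a set of strings, e.g. {"male"}
        "male": lambda m: "male" in m["gender"],
        "female": lambda m: "female" in m["gender"],
    }
    counts = disjoint_counts(members, shapes)
    return counts["male"] == 1 and counts["female"] == 1
```

A member that matches both shapes is counted by neither, which mirrors the effect of adding sh:not for every sibling shape by hand.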

I hope that once we publish a CR, some resources may become available for creating a non-normative document with lots of examples on best practices for using SHACL SPARQL to support use cases that can’t be done with SHACL CORE.

Irene

> On Jan 13, 2017, at 4:12 AM, Eric Prud'hommeaux <eric@w3.org> wrote:
> 
> * Holger Knublauch <holger@topquadrant.com> [2017-01-13 08:36+1000]
>> On 13/01/2017 3:14, Eric Prud'hommeaux wrote:
>>> * Irene Polikoff <irene@topquadrant.com> [2017-01-12 12:05-0500]
>>>> Hi Eric,
>>>> 
>>>> I hope your proposal could be restored.
>> 
>> I had moved some documents into a sub-folder on github, and forgot to update
>> the reference. I have now updated the reference to the Abstract Syntax on
>> the main page:
>> 
>> https://www.w3.org/2014/data-shapes/wiki/Main_Page#Deliverables
>> 
>> The deep link should then be
>> 
>> http://w3c.github.io/data-shapes/unmaintained/shacl-abstract-syntax/#satisfies-PartitionConstraint
> 
> I would move it back as there's a TR/ doc pointing to it, or at least
> put in a forwarding document (unless github.io has a convention for
> redirects).
> 
> 
>>>> 
>>>> Since partitioning was not a goal, but a way to address these use cases and SHACL has QualifiedValueShape for the first use case and an OR for the second use case, do you think the use cases are addressed or is there still something missing?
>>> Partitioning was not a goal but was the only way we found of providing
>>> a satisfactory user experience. While qualified cardinality
>>> restrictions can capture top-level repeated properties, they become
>>> quite a burden to author and maintain. Likewise, exploding every OR
>>> with a negation of the DNF of the other disjuncts requires diligence
>>> and is tedious and error prone if the disjuncts have any complexity to
>>> them.
>> 
>> Let's assume we want to express that an old-fashioned Marriage consists of
>> one female and one male member, making sure no instance is counted twice
>> (i.e. male and female at once). With the current QCRs that would be
>> expressed using
>> 
>> ex:MarriageShape
>>    a sh:NodeShape ;
>>    sh:property [
>>        sh:predicate ex:member ;
>>        sh:qualifiedMinCount 1 ;
>>        sh:qualifiedMaxCount 1 ;
>>        sh:qualifiedValueShape [
>>            sh:shape ex:MaleShape ;
>>            sh:not ex:FemaleShape ;
>>        ] ;
>>    ] ;
>>    sh:property [
>>        sh:predicate ex:member ;
>>        sh:qualifiedMinCount 1 ;
>>        sh:qualifiedMaxCount 1 ;
>>        sh:qualifiedValueShape [
>>            sh:shape ex:FemaleShape ;
>>            sh:not ex:MaleShape ;
>>        ] ;
>>    ] .
>> 
>> In cases of complex expressions and many combinations this could indeed
>> become a bit complex to write (unless it's produced by an algorithm anyway
>> such as the converter from Compact Syntax to RDF).
>> 
>> However, it seems easy to add a flag to the shape to indicate whether it's
>> supposed to interpret the value shapes as being disjoint, e.g.
>> 
>> ex:MarriageShape
>>    a sh:NodeShape ;
>>    sh:qualifiedValueShapesDisjoint true ;
>>    sh:property [
>>        sh:predicate ex:member ;
>>        sh:qualifiedMinCount 1 ;
>>        sh:qualifiedMaxCount 1 ;
>>        sh:qualifiedValueShape ex:MaleShape ;
>>    ] ;
>>    sh:property [
>>        sh:predicate ex:member ;
>>        sh:qualifiedMinCount 1 ;
>>        sh:qualifiedMaxCount 1 ;
>>        sh:qualifiedValueShape ex:FemaleShape ;
>>    ] .
>> 
>> Would such a flag help? We could delete sh:partition and would "just" need
>> to update the definition of how qualified value shapes are interpreted,
>> basically adding sh:not statements for all sibling QCRs on the same
>> predicate. I must be missing something obvious?
> 
> It helps with top-level repeated properties. Suppose our
> (hetero-normative) marriage database flags the .1% of the males born
> with an extra Y chromosome:
> 
>  <MarriageShape> {
>    ex:member @ex:XXFemaleShape ;
>    (  ex:marker [ex:XYY] ;
>       ex:member @ex:XYYMaleShape
>     | ex:member @ex:XYMaleShape )
>  }
> 
> The point of the partitioning strategy was to provide a consistent
> interface regardless of whether properties were mentioned one or more
> times, within blocks of arbitrary cardinality.
> 
> Point 2 below can be illustrated with:
> 
>  <PersonShape> {
>     foaf:name xsd:string
>   |
>    (foaf:givenName xsd:string;
>     foaf:familyName xsd:string) # ()s included for clarity
>  }
> 
> The goal of partitioning is to accept data that fulfills exactly one disjunct:
>  { <A> foaf:name "Bob Smith" }
> 
>  { <B> foaf:givenName "Bob";
>        foaf:familyName "Smith" }
> 
> and reject data that doesn't fit squarely into one or the other disjunct:
> 
>  { <C> foaf:name "Bob Smith";
>        foaf:givenName "Bob" }
> 
>  { <D> foaf:name "Bob Smith";
>        foaf:givenName "Bob";
>        foaf:familyName "Smith" }
> 
> 
>> Holger
>> 
>> 
>>> 
>>> 
>>>> Irene
>>>> 
>>>>> On Jan 12, 2017, at 6:00 AM, Eric Prud'hommeaux <eric@w3.org> wrote:
>>>>> 
>>>>> Originally, ShEx (a surface syntax for ResourceShape plus OR) didn't have partitions. It was added to:
>>>>> 
>>>>> 1 address the frequent (and encouraged) reuse of generic properties. These are extremely common in clinical data, for instance a BP observation has two components, one of which has a code for systolic and the other a code for diastolic.
>>>>> 
>>>>> 2 provide an OR that was closer to user expectations and reduced the need for defensive programming, e.g. a shape expecting either a name or a given and family name should reject a mixture like { <s> foaf:name "X"; foaf:givenName "Y" }.
>>>>> 
>>>>> Partitioning was never a goal; it was a means of satisfying these needs.
>> 
>> 
> 
> -- 
> -ericP
> 
> office: +1.617.599.3509
> mobile: +33.6.80.80.35.59
> 
> (eric@w3.org)
> Feel free to forward this message to any list for any purpose other than
> email address distribution.
> 
> There are subtle nuances encoded in font variation and clever layout
> which can only be seen by printing this message on high-clay paper.

Received on Tuesday, 17 January 2017 01:02:48 UTC