Re: ISSUE-92: Should repeated properties be interpreted as additive or conjunctive?

From: Eric Prud'hommeaux <eric@w3.org>
Date: Fri, 13 Jan 2017 04:12:10 -0500
To: Holger Knublauch <holger@topquadrant.com>
Cc: public-data-shapes-wg@w3.org
Message-ID: <20170113091208.GZ6394@w3.org>
* Holger Knublauch <holger@topquadrant.com> [2017-01-13 08:36+1000]
> On 13/01/2017 3:14, Eric Prud'hommeaux wrote:
> >* Irene Polikoff <irene@topquadrant.com> [2017-01-12 12:05-0500]
> >>Hi Eric,
> >>
> >>I hope your proposal could be restored.
> 
> I had moved some documents into a sub-folder on github, and forgot to update
> the reference. I have now updated the reference to the Abstract Syntax on
> the main page:
> 
> https://www.w3.org/2014/data-shapes/wiki/Main_Page#Deliverables
> 
> The deep link should then be
> 
> http://w3c.github.io/data-shapes/unmaintained/shacl-abstract-syntax/#satisfies-PartitionConstraint

I would move it back as there's a TR/ doc pointing to it, or at least
put in a forwarding document (unless github.io has a convention for
redirects).


> >>
> >>Since partitioning was not a goal but a way to address these use cases, and SHACL has QualifiedValueShape for the first use case and an OR for the second, do you think the use cases are addressed, or is there still something missing?
> >Partitioning was not a goal but was the only way we found of providing
> >a satisfactory user experience. While qualified cardinality
> >restrictions can capture top-level repeated properties, they become
> >quite a burden to author and maintain. Likewise, exploding every OR
> >with a negation of the DNF of the other disjuncts requires diligence
> >and is tedious and error prone if the disjuncts have any complexity to
> >them.
> 
> Let's assume we want to express that an old-fashioned Marriage consists of
> one female and one male member, making sure no instance is counted twice
> (i.e. male and female at once). With the current QCRs that would be
> expressed using
> 
> ex:MarriageShape
>     a sh:NodeShape ;
>     sh:property [
>         sh:predicate ex:member ;
>         sh:qualifiedMinCount 1 ;
>         sh:qualifiedMaxCount 1 ;
>         sh:qualifiedValueShape [
>             sh:shape ex:MaleShape ;
>             sh:not ex:FemaleShape ;
>         ] ;
>     ] ;
>     sh:property [
>         sh:predicate ex:member ;
>         sh:qualifiedMinCount 1 ;
>         sh:qualifiedMaxCount 1 ;
>         sh:qualifiedValueShape [
>             sh:shape ex:FemaleShape ;
>             sh:not ex:MaleShape ;
>         ] ;
>     ] .
> 
> In cases of complex expressions and many combinations this could indeed
> become a bit complex to write (unless it's produced by an algorithm anyway
> such as the converter from Compact Syntax to RDF).
> 
> However, it seems easy to add a flag to the shape to indicate whether it's
> supposed to interpret the value shapes as being disjoint, e.g.
> 
> ex:MarriageShape
>     a sh:NodeShape ;
>     sh:qualifiedValueShapesDisjoint true ;
>     sh:property [
>         sh:predicate ex:member ;
>         sh:qualifiedMinCount 1 ;
>         sh:qualifiedMaxCount 1 ;
>         sh:qualifiedValueShape ex:MaleShape ;
>     ] ;
>     sh:property [
>         sh:predicate ex:member ;
>         sh:qualifiedMinCount 1 ;
>         sh:qualifiedMaxCount 1 ;
>         sh:qualifiedValueShape ex:FemaleShape ;
>     ] .
> 
> Would such a flag help? We could delete sh:partition and would "just" need
> to update the definition of how qualified value shapes are interpreted,
> basically adding sh:not statements for all sibling QCRs on the same
> predicate. I must be missing something obvious?

It helps with top-level repeated properties, but not with ones nested
inside a choice. Suppose our (hetero-normative) marriage database flags
the .1% of the males born with an extra Y chromosome:

  <MarriageShape> {
    ex:member @ex:XXFemaleShape ;
    (  ex:marker [ex:XYY] ;
       ex:member @ex:XYYMaleShape
     | ex:member @ex:XYMaleShape )
  }

The point of the partitioning strategy was to provide a consistent
interface regardless of whether properties were mentioned once or
several times, within blocks of arbitrary cardinality.
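
To make the top-level case concrete, here is a minimal Python sketch of
how the disjointness flag proposed above might change qualified
counting. It is an illustration under assumptions, not SHACL's
definition: each member is modelled as the set of shapes it is assumed
to conform to already, and the names qualified_count and marriage_ok
are invented for this example.

```python
def qualified_count(values, shape, siblings, disjoint):
    """Count values conforming to `shape`.

    With `disjoint` set, a value that also conforms to any sibling
    shape is not counted (the implicit sh:not on sibling QCRs).
    """
    count = 0
    for conforms_to in values:
        if shape not in conforms_to:
            continue
        if disjoint and conforms_to & siblings:
            continue
        count += 1
    return count


def marriage_ok(members, disjoint):
    """Require exactly one qualifying member per shape (min 1, max 1)."""
    shapes = {"ex:MaleShape", "ex:FemaleShape"}
    return all(
        qualified_count(members, s, shapes - {s}, disjoint) == 1
        for s in shapes
    )


# Each member is represented by the set of shapes it conforms to
# (assumed to be computed elsewhere).
two_people = [{"ex:MaleShape"}, {"ex:FemaleShape"}]
one_person_both = [{"ex:MaleShape", "ex:FemaleShape"}]

print(marriage_ok(two_people, disjoint=True))
print(marriage_ok(one_person_both, disjoint=False))
print(marriage_ok(one_person_both, disjoint=True))
```

Without the flag, a single member conforming to both shapes satisfies
both counts; with it, that member is counted for neither. The sketch
only covers repeated properties at the top level of one shape.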

Point 2 below can be illustrated with:

  <PersonShape> {
     foaf:name xsd:string
   |
    (foaf:givenName xsd:string;
     foaf:familyName xsd:string) # ()s included for clarity
  }

The goal of partitioning is to accept data that fulfills exactly one disjunct:
  { <A> foaf:name "Bob Smith" }

  { <B> foaf:givenName "Bob";
        foaf:familyName "Smith" }

and reject data that doesn't fit squarely into one or the other disjunct:

  { <C> foaf:name "Bob Smith";
        foaf:givenName "Bob" }

  { <D> foaf:name "Bob Smith";
        foaf:givenName "Bob";
        foaf:familyName "Smith" }
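
The accept/reject behaviour above can be approximated with a short
Python sketch. This is not the ShEx partition algorithm, only a
simplified model under assumptions: a node is a dict from predicate to
a list of values, each disjunct is the set of predicates it requires
exactly once with a string value, and predicates not mentioned in the
shape are ignored. The names DISJUNCTS, matches and accepts are
invented for this example.

```python
# Hypothetical model of <PersonShape>: each disjunct lists the
# predicates it requires, each exactly once with a string value.
DISJUNCTS = [
    {"foaf:name"},
    {"foaf:givenName", "foaf:familyName"},
]

# Predicates the shape talks about; anything else is ignored here.
MENTIONED = set().union(*DISJUNCTS)


def matches(node, disjunct):
    """True if the node's mentioned triples are exactly the disjunct's."""
    used = {p for p in node if p in MENTIONED}
    if used != disjunct:
        return False  # a required triple is missing or one is left over
    return all(
        len(node[p]) == 1 and isinstance(node[p][0], str)
        for p in disjunct
    )


def accepts(node):
    """Accept only if exactly one disjunct accounts for the triples."""
    return sum(matches(node, d) for d in DISJUNCTS) == 1


A = {"foaf:name": ["Bob Smith"]}
B = {"foaf:givenName": ["Bob"], "foaf:familyName": ["Smith"]}
C = {"foaf:name": ["Bob Smith"], "foaf:givenName": ["Bob"]}
D = {"foaf:name": ["Bob Smith"], "foaf:givenName": ["Bob"],
     "foaf:familyName": ["Smith"]}

for label, node in [("A", A), ("B", B), ("C", C), ("D", D)]:
    print(label, accepts(node))
```

In this model <A> and <B> are accepted, while <C> and <D> are rejected
because the leftover triples fit neither disjunct.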


> Holger
> 
> 
> >
> >
> >>Irene
> >>
> >>>On Jan 12, 2017, at 6:00 AM, Eric Prud'hommeaux <eric@w3.org> wrote:
> >>>
> >>>Originally, ShEx (a surface syntax for ResourceShape plus OR) didn't have partitions. Partitioning was added to:
> >>>
> >>>  1 address the frequent (and encouraged) reuse of generic properties. These are extremely common in clinical data, for instance a BP observation has two components, one of which has a code for systolic and the other a code for diastolic.
> >>>
> >>>  2 provide an OR that was closer to user expectations and reduced the need for defensive programming, e.g. a shape expecting either a name or a given and family name should reject a mixture like { <s> foaf:name "X"; foaf:givenName "Y" }.
> >>>
> >>>Partitioning was never a goal; it was a means of satisfying these needs.
> 
> 

-- 
-ericP

office: +1.617.599.3509
mobile: +33.6.80.80.35.59

(eric@w3.org)
Feel free to forward this message to any list for any purpose other than
email address distribution.

There are subtle nuances encoded in font variation and clever layout
which can only be seen by printing this message on high-clay paper.
Received on Friday, 13 January 2017 09:12:20 UTC