Re: an alternative proposal for partition

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Fri, 12 Aug 2016 15:14:58 -0700
To: public-data-shapes-wg@w3.org
Message-ID: <bc56d04f-648b-ca4a-d82a-face3dbea84c@kcoyle.net>


On 8/11/16 11:05 PM, Holger Knublauch wrote:
> I think this will come down to a general design choice. Do we want to
> add this very complex feature to the SHACL Core, or do we simply point
> people at the SPARQL extension mechanism? QCRs cover quite a number of
> use cases. Not every use case will be nicely expressible this way, but
> then OTOH many people already know SPARQL and use it every day for these
> very kinds of complex queries. Why do we need to reinvent everything
> into a higher-level language,

For the people who do not know SPARQL at that level. - kc

> esp. given that - without doubt - someone
> else will request yet another design pattern that is not covered by our
> core language even with the most general sh:partition feature. We need
> to stop somewhere.
>
> In SPARQL, the check for your scenario could be something like (untested):
>
> SELECT ?this
> WHERE {
>     FILTER NOT EXISTS {
>         ?this dc:creator ?mil .
>         # str() because regex() needs a string and creators may be IRIs
>         FILTER regex(str(?mil), "^mailto:.*@b.mil") .
>         ?this dc:creator ?gov .
>         FILTER (regex(str(?gov), "^mailto:.*@a.gov") ||
>                 EXISTS { ?gov foaf:mbox ?mbox .
>                          FILTER regex(str(?mbox), "^mailto:.*@a.gov") })
>     }
> }
>
> and we don't need to reinvent further wheels. You can even combine these
> with existing SHACL shapes, using sh:hasShape. Or you can make this
> nicer with SHACL functions, e.g.
>
> SELECT ?this
> WHERE {
>     FILTER NOT EXISTS {
>         ?this dc:creator ?mil .
>         FILTER ex:isMilEmail(?mil) .
>         ?this dc:creator ?gov .
>         FILTER (ex:isGovEmail(?gov) ||
>                 EXISTS { ?gov foaf:mbox ?mbox . FILTER ex:isGovEmail(?mbox) })
>     }
> }
>
> which is IMHO quite an acceptable Compact Syntax, only far more general.
>
> Holger
>
>
> On 12/08/2016 11:01, Eric Prud'hommeaux wrote:
>> * Holger Knublauch <holger@topquadrant.com> [2016-08-11 17:14+1000]
>>> This looks like quite a mega feature, if sh:and and sh:or become
>>> overloaded with very different meaning, requiring a new execution
>>> algorithm etc. What about spawning this off into an extension, just
>>> like the SPARQL stuff is in an extension?
>>>
>>> Another option is to handle this on the Compact Syntax level, and
>>> produce QCRs under the hood. Are there any scenarios where QCRs could
>>> not (in principle) express your use cases?
>> The QCRs are relatively simple to generate but the universal
>> constraint is problematic. Taking the 2nd example below with a
>>
>> shexc:
>>    <S> {
>>      (   dc:creator PATTERN "^mailto:.*@a.gov"  # either creator a.gov
>>        | dc:creator {                           # or a creator some node
>>            foaf:mbox PATTERN "^mailto:.*@a.gov" # with a foaf:mbox of a.gov
>>          }
>>      ) ;
>>      dc:creator PATTERN "^mailto:.*@b.mil"     # and one b.mil creator
>>    }
>>
>> dc:creator which may be an email or a bnode with a foaf:mbox, we can
>> compose an additional universal constraint which limits the objects
>> of dc:creator to the three enumerated forms:
>>
>> shacl:
>>    <S>
>>      sh:and (
>>        [ sh:or (
>>    #   dc:creator PATTERN "^mailto:.*@a.gov"  # either creator a.gov
>>          [ sh:property
>>            [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@a.gov" ;
>>              sh:qualifiedMinCount 1 ; sh:qualifiedMaxCount 1
>>            ] ]
>>    # | dc:creator {                           # or a creator some node
>>    #     foaf:mbox PATTERN "^mailto:.*@a.gov" # with a foaf:mbox of a.gov
>>    #   }
>>          [ sh:property
>>            [ sh:predicate dc:creator ; sh:shape [
>>                sh:property
>>                  [ sh:predicate foaf:mbox ; sh:pattern "^mailto:.*@a.gov" ;
>>                    sh:minCount 1 ; sh:maxCount 1
>>                  ] ] ; sh:qualifiedMinCount 1 ; sh:qualifiedMaxCount 1
>>            ] ]
>>        ) ]
>>    #   dc:creator PATTERN "^mailto:.*@b.mil"     # and one b.mil creator
>>        [ sh:property
>>          [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@b.mil" ;
>>            sh:qualifiedMinCount 1 ; sh:qualifiedMaxCount 1
>>          ] ]
>>    # universal constraint to handle closure
>>        [ sh:or (
>>          [ sh:property
>>            [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@a.gov" ;
>>              sh:qualifiedMinCount 1 ; sh:qualifiedMaxCount 1
>>            ] ]
>>          [ sh:property
>>            [ sh:predicate dc:creator ; sh:shape [
>>                sh:property
>>                  [ sh:predicate foaf:mbox ; sh:pattern "^mailto:.*@a.gov" ;
>>                    sh:minCount 1 ; sh:maxCount 1
>>                  ] ] ; sh:qualifiedMinCount 1 ; sh:qualifiedMaxCount 1
>>            ] ]
>>          [ sh:property
>>            [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@b.mil" ;
>>              sh:qualifiedMinCount 1 ; sh:qualifiedMaxCount 1
>>            ] ]
>>        ) ]
>>      ) .
>>
>> This gets us part way there, but round-tripping may be impossible and
>> it doesn't provide the OneOf OR behavior as described in the
>> foaf:name example below. A typical example from clinical data (FHIR)
>> analogous to the (name|givenName,familyName) example would involve making
>> sure that an osteoporotic spiral fracture had either a single
>> component with a compound code ("46675001|73737008") or two
>> components, each with one of those codes. For instance, ShEx's
>> partitioning semantics allows us to easily capture this error:
>>
>>    <Obs1> a fhir:Observation ;
>>      fhir:component
>>        [ fhir:code "46675001|73737008" ],
>>        [ fhir:code "46675001" ].
>>
>>
>>> Holger
>>>
>>>
>>> On 11/08/2016 11:32, Eric Prud'hommeaux wrote:
>>>> The current partition meets some additive use cases like:
>>>>    <S> {
>>>>      dc:creator PATTERN "^mailto:a.gov" ; # one a.gov creator
>>>>      dc:creator PATTERN "^mailto:b.mil"   # and one b.mil creator
>>>>    }
>>>>
>>>> but not ones with any algebraic operators like:
>>>>    <S> {
>>>>      (   dc:creator PATTERN "^mailto:.*@a.gov"  # either creator a.gov
>>>>        | dc:creator {                           # or a creator some node
>>>>            foaf:mbox PATTERN "^mailto:.*@a.gov" # with a foaf:mbox of a.gov
>>>>          }
>>>>      ) ;
>>>>      dc:creator PATTERN "^mailto:.*@b.mil"     # and one b.mil creator
>>>>    }
>>>>
>>>> An alternative would be to create a syntax to capture ShEx's
>>>> partition semantics, which say:
>>>>    Map the triples to the triple patterns with the same predicate.
>>>>    The node is valid with respect to a triple expression if there
>>>>    is a mapping of triple to triple pattern which satisfies the
>>>>    expression.
>>>> For instance, the data
>>>>    <s> dc:creator <mailto:a@b.mil> .
>>>>    <s> dc:creator _:b1 .
>>>>    _:b1 foaf:mbox <mailto:b@a.gov> .
>>>> satisfies the above pattern.
>>>>
>>>> I propose leveraging the current partition but allowing expressions:
>>>>    <S> sh:partition [
>>>>      sh:and (
>>>>        [ sh:property [
>>>>            sh:predicate ex:creator ; sh:minCount 1 ; sh:maxCount 1 ;
>>>>            sh:pattern "^mailto:.*@a.gov" ] ]
>>>>        [ sh:property [
>>>>            sh:predicate ex:creator ; sh:minCount 1 ; sh:maxCount 1 ;
>>>>            sh:pattern "^mailto:.*@b.mil" ] ]
>>>>      )
>>>>    ] .
>>>>
>>>> This also handily provides a semantics with a disjunctive OR so e.g.
>>>>    <EmployeeShape> {
>>>>        foaf:name .          # either a foaf:name
>>>>      | ( foaf:givenName . ; # or a pair of givenName
>>>>          foaf:familyName .  # and familyName
>>>>        )
>>>>    }
>>>> would not be satisfied with a partial pair:
>>>>    <emp1> foaf:name "Alice Cooper" .
>>>>    <emp1> foaf:familyName "Cooper" .
>>>> because the 1st disjunct doesn't use the 2nd triple and the 2nd
>>>> disjunct is missing a givenName.
>>>>
>>>> Users wanting either additive properties or disjunctive OR could
>>>> use the sh:partition operator.
>>>>
>>>
>
>
>

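For anyone following the thread without a ShEx background, the partition
semantics quoted above ("there is a mapping of triple to triple pattern
which satisfies the expression") can be sketched as a brute-force search.
This is only an illustration under stated assumptions, not how ShEx or any
SHACL engine implements it: the tuple encoding of expressions, the names
matches/conforms/employee_shape, and the reading of the "|" operator as
"some alternative consumes all the triples" are all hypothetical choices
made for the sketch.

```python
from itertools import product

# Hypothetical encoding of a triple expression (not ShEx or SHACL syntax):
#   ("tc", predicate, test)  one triple with this predicate whose value passes test
#   ("each_of", [exprs])     ";" : every sub-expression gets its own share of triples
#   ("one_of", [exprs])      "|" : some sub-expression consumes all the triples

def predicates(expr):
    """Collect the predicates mentioned anywhere in the expression."""
    if expr[0] == "tc":
        return {expr[1]}
    return set().union(*(predicates(e) for e in expr[1]))

def matches(expr, triples):
    """True if the (predicate, value) pairs are consumed exactly by expr."""
    kind = expr[0]
    if kind == "tc":
        _, pred, test = expr
        return len(triples) == 1 and triples[0][0] == pred and test(triples[0][1])
    if kind == "one_of":
        return any(matches(e, triples) for e in expr[1])
    if kind == "each_of":
        parts = expr[1]
        # Try every assignment of each triple to exactly one sub-expression.
        for assignment in product(range(len(parts)), repeat=len(triples)):
            groups = [[t for t, i in zip(triples, assignment) if i == k]
                      for k in range(len(parts))]
            if all(matches(p, g) for p, g in zip(parts, groups)):
                return True
        return False
    raise ValueError(f"unknown expression kind: {kind}")

def conforms(expr, node_triples):
    """Only triples whose predicate the expression mentions must be mapped."""
    relevant = [t for t in node_triples if t[0] in predicates(expr)]
    return matches(expr, relevant)

def anything(value):
    """Stand-in for the ShEx wildcard '.'."""
    return True

# foaf:name | ( foaf:givenName ; foaf:familyName )
employee_shape = ("one_of", [
    ("tc", "foaf:name", anything),
    ("each_of", [("tc", "foaf:givenName", anything),
                 ("tc", "foaf:familyName", anything)]),
])

# The partial pair from the thread: no mapping uses both triples.
emp1 = [("foaf:name", "Alice Cooper"), ("foaf:familyName", "Cooper")]
print(conforms(employee_shape, emp1))
```

The search is exponential in the number of triples, so it is only useful
for seeing why the partial pair fails: every triple has to land on exactly
one triple pattern, and no assignment of these two triples does that.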
-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet/+1-510-984-3600
Received on Friday, 12 August 2016 22:15:28 UTC
