
Re: an alternative proposal for partition

From: Karen Coyle <kcoyle@kcoyle.net>
Date: Fri, 12 Aug 2016 16:08:37 -0700
To: Martynas Jusevičius <martynas@atomgraph.com>
Cc: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
Message-ID: <3b782c57-b930-0eb3-487f-ae543536986a@kcoyle.net>
Martynas, being able to create extensions to SHACL is not going to be in 
the skill set of someone who only knows simple queries - did you miss 
the "at that level" part of my response? And, actually, some people are 
not directly using SPARQL but have API-like mechanisms that allow them 
to select RDF graphs.

We can't build SHACL on the assumption that everyone has deep SPARQL 
skills, because that is not reality.

kc


On 8/12/16 3:55 PM, Martynas Jusevičius wrote:
> How come you expect people to validate RDF without being able to query it?
>
> On Sat, Aug 13, 2016 at 12:14 AM, Karen Coyle <kcoyle@kcoyle.net> wrote:
>>
>>
>> On 8/11/16 11:05 PM, Holger Knublauch wrote:
>>>
>>> I think this will come down to a general design choice. Do we want to
>>> add this very complex feature to the SHACL Core, or do we simply point
>>> people at the SPARQL extension mechanism. QCRs cover quite a number of
>>> use cases. Not every use case will be nicely expressible this way, but
>>> then OTOH many people already know SPARQL and use it every day for these
>>> very kinds of complex queries. Why do we need to reinvent everything
>>> into a higher-level language,
>>
>>
>> For the people who do not know SPARQL at that level. - kc
>>
>>
>>> esp. given that - without doubt - someone
>>> else will request yet another design pattern that is not covered by our
>>> core language even with the most general sh:partition feature. We need
>>> to stop somewhere.
>>>
>>> In SPARQL, the check for your scenario could be something like (untested):
>>>
>>> SELECT ?this
>>> WHERE {
>>>     FILTER NOT EXISTS {
>>>         ?this dc:creator ?mil .
>>>         FILTER regex(str(?mil), "^mailto:.*@b.mil") .
>>>         ?this dc:creator ?gov .
>>>         FILTER (regex(str(?gov), "^mailto:.*@a.gov") ||
>>>                 EXISTS { ?gov foaf:mbox ?mbox .
>>>                          FILTER regex(str(?mbox), "^mailto:.*@a.gov") })
>>>     }
>>> }
>>>
>>> and we don't need to reinvent further wheels. You can even combine these
>>> with existing SHACL shapes, using sh:hasShape. Or you can make this
>>> nicer with SHACL functions, e.g.
>>>
>>> SELECT ?this
>>> WHERE {
>>>     FILTER NOT EXISTS {
>>>         ?this dc:creator ?mil .
>>>         FILTER ex:isMilEmail(?mil) .
>>>         ?this dc:creator ?gov .
>>>         FILTER (ex:isGovEmail(?gov) ||
>>>                 EXISTS { ?gov foaf:mbox ?mbox . FILTER ex:isGovEmail(?mbox) })
>>>     }
>>> }
>>>
>>> which is IMHO quite an acceptable Compact Syntax, only far more general.
>>>
>>> Holger
>>>
>>>
>>> On 12/08/2016 11:01, Eric Prud'hommeaux wrote:
>>>>
>>>> * Holger Knublauch <holger@topquadrant.com> [2016-08-11 17:14+1000]
>>>>>
>>>>> This looks like quite a mega feature, if sh:and and sh:or become
>>>>> overloaded with very different meaning, requiring a new execution
>>>>> algorithm etc. What about spawning this off into an extension, just
>>>>> like the SPARQL stuff is in an extension?
>>>>>
>>>>> Another option is to handle this on the Compact Syntax level, and
>>>>> produce QCRs under the hood. Are there any scenarios where QCRs could
>>>>> not (in principle) express your use cases?
>>>>
>>>> The QCRs are relatively simple to generate but the universal
>>>> constraint is problematic. Taking the 2nd example below with a
>>>>
>>>> shexc:
>>>>    <S> {
>>>>      (   dc:creator PATTERN "^mailto:.*@a.gov"  # either creator a.gov
>>>>        | dc:creator {                           # or a creator some node
>>>>            foaf:mbox PATTERN "^mailto:.*@a.gov" # with a foaf:mbox of a.gov
>>>>          }
>>>>      ) ;
>>>>      dc:creator PATTERN "^mailto:.*@b.mil"     # and one b.mil creator
>>>>    }
>>>>
>>>> dc:creator which may be an email or a bnode with a foaf:mbox, we can
>>>> compose an additional universal constraint which limits the objects
>>>> of dc:creator to the three enumerated forms:
>>>>
>>>> shacl:
>>>>    <S>
>>>>      sh:and (
>>>>        [ sh:or (
>>>>    #   dc:creator PATTERN "^mailto:.*@a.gov"  # either creator a.gov
>>>>          [ sh:property
>>>>            [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@a.gov" ;
>>>>              sh:qualifiedMinCount 1; sh:qualifiedMaxCount 1
>>>>     ] ]
>>>>    # | dc:creator {                           # or a creator some node
>>>>    #     foaf:mbox PATTERN "^mailto:.*@a.gov" # with a foaf:mbox of a.gov
>>>>    #   }
>>>>          [ sh:property
>>>>            [ sh:predicate dc:creator ; sh:shape [
>>>>           sh:property
>>>>             [ sh:predicate foaf:mbox ; sh:pattern "^mailto:.*@a.gov" ;
>>>>                    sh:minCount 1; sh:maxCount 1
>>>>         ] ] ; sh:qualifiedMinCount 1; sh:qualifiedMaxCount 1
>>>>     ] ]
>>>>        ) ]
>>>>    #   dc:creator PATTERN "^mailto:.*@b.mil"     # and one b.mil creator
>>>>        [ sh:property
>>>>          [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@b.mil" ;
>>>>            sh:qualifiedMinCount 1; sh:qualifiedMaxCount 1
>>>>        ] ]
>>>>    # universal constraint to handle closure
>>>>        [ sh:or (
>>>>          [ sh:property
>>>>            [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@a.gov" ;
>>>>              sh:qualifiedMinCount 1; sh:qualifiedMaxCount 1
>>>>     ] ]
>>>>          [ sh:property
>>>>            [ sh:predicate dc:creator ; sh:shape [
>>>>           sh:property
>>>>             [ sh:predicate foaf:mbox ; sh:pattern "^mailto:.*@a.gov" ;
>>>>                    sh:minCount 1; sh:maxCount 1
>>>>         ] ] ; sh:qualifiedMinCount 1; sh:qualifiedMaxCount 1
>>>>     ] ]
>>>>          [ sh:property
>>>>            [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@b.mil" ;
>>>>              sh:qualifiedMinCount 1; sh:qualifiedMaxCount 1
>>>>          ] ]
>>>>        ) ]
>>>>      ) .
>>>>
>>>> This gets us part way there, but round-tripping may be impossible and
>>>> it doesn't provide the OneOf OR behavior as described in the
>>>> foaf:name example below. A typical example from clinical data (FHIR)
>>>> analogous to the (name|givenName,familyName) example would involve making
>>>> sure that an osteoporotic spiral fracture had either a single
>>>> component with a compound code ("46675001|73737008") or two
>>>> components, each with one of those codes. For instance, ShEx's
>>>> partitioning semantics allows us to easily capture this error:
>>>>
>>>>    <Obs1> a fhir:Observation ;
>>>>      fhir:component
>>>>        [ fhir:code "46675001|73737008" ],
>>>>        [ fhir:code "46675001" ].
>>>>
>>>>
>>>>> Holger
>>>>>
>>>>>
>>>>> On 11/08/2016 11:32, Eric Prud'hommeaux wrote:
>>>>>>
>>>>>> The current partition meets some additive use cases like:
>>>>>>    <S> {
>>>>>>      dc:creator PATTERN "^mailto:a.gov" ; # one a.gov creator
>>>>>>      dc:creator PATTERN "^mailto:b.mil"   # and one b.mil creator
>>>>>>    }
>>>>>>
>>>>>> but not ones with any algebraic operators like:
>>>>>>    <S> {
>>>>>>      (   dc:creator PATTERN "^mailto:.*@a.gov"  # either creator a.gov
>>>>>>        | dc:creator {                           # or a creator some node
>>>>>>            foaf:mbox PATTERN "^mailto:.*@a.gov" # with a foaf:mbox of a.gov
>>>>>>          }
>>>>>>      ) ;
>>>>>>      dc:creator PATTERN "^mailto:.*@b.mil"     # and one b.mil creator
>>>>>>    }
>>>>>>
>>>>>> An alternative would be to create a syntax to capture ShEx's
>>>>>> partition semantics, which say:
>>>>>>    Map the triples to the triple patterns with the same predicate.
>>>>>>    The node is valid with respect to a triple expression if there
>>>>>>    is a mapping of triple to triple pattern which satisfies the
>>>>>>    expression.
>>>>>> For instance, the data
>>>>>>    <s> dc:creator <mailto:a@b.mil> .
>>>>>>    <s> dc:creator _:b1 .
>>>>>>    _:b1 foaf:mbox <mailto:b@a.gov> .
>>>>>> satisfies the above pattern.
>>>>>>
>>>>>> I propose leveraging the current partition but allowing expressions:
>>>>>>    <S> sh:partition [
>>>>>>      sh:and (
>>>>>>        [ sh:property [
>>>>>>            sh:predicate ex:creator ; sh:minCount 1 ; sh:maxCount 1 ;
>>>>>>       sh:pattern "^mailto:.*@a.gov" ] ]
>>>>>>        [ sh:property [
>>>>>>            sh:predicate ex:creator ; sh:minCount 1 ; sh:maxCount 1 ;
>>>>>>       sh:pattern "^mailto:.*@b.mil" ] ]
>>>>>>      ) ] .
>>>>>>
>>>>>> This also handily provides a semantics with a disjunctive OR so e.g.
>>>>>>    <EmployeeShape> {
>>>>>>        foaf:name .          # either a foaf:name
>>>>>>      | ( foaf:givenName . ; # or a pair of givenName
>>>>>>          foaf:familyName .  # and familyName
>>>>>>        )
>>>>>>    }
>>>>>> would not be satisfied with a partial pair:
>>>>>>    <emp1> foaf:name "Alice Cooper" .
>>>>>>    <emp1> foaf:familyName "Cooper" .
>>>>>> because the 1st disjunct doesn't use the 2nd triple and the 2nd
>>>>>> disjunct is missing a givenName.
>>>>>>
>>>>>> Users wanting either additive properties or disjunctive OR could
>>>>>> use the sh:partition operator.
>>>>>>
>>>>>
>>>
>>>
>>>
>>
>> --
>> Karen Coyle
>> kcoyle@kcoyle.net http://kcoyle.net
>> m: 1-510-435-8234
>> skype: kcoylenet/+1-510-984-3600
>>
>

-- 
Karen Coyle
kcoyle@kcoyle.net http://kcoyle.net
m: 1-510-435-8234
skype: kcoylenet/+1-510-984-3600
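
The partition semantics quoted above ("map the triples to the triple patterns with the same predicate ... valid if there is a mapping which satisfies the expression") can be illustrated with a small brute-force sketch in Python. This is not taken from any ShEx or SHACL implementation. The helper names (tc, each_of, one_of, matches) are hypothetical, every triple constraint is assumed to have cardinality exactly one, and the input is assumed to be only those (predicate, object) pairs of the focus node whose predicates the shape mentions.

```python
import re
from itertools import product

# Hypothetical mini-algebra for triple expressions (illustration only).
def tc(predicate, pattern=".*"):
    """Triple constraint: exactly one triple with this predicate and a matching object."""
    return ("tc", predicate, re.compile(pattern))

def each_of(*exprs):
    """Conjunction (';' in ShExC): the triples are partitioned among the sub-expressions."""
    return ("each", exprs)

def one_of(*exprs):
    """Disjunction ('|' in ShExC): one sub-expression must account for all the triples."""
    return ("one", exprs)

def matches(expr, triples):
    """True if some mapping of triples to triple constraints satisfies expr."""
    kind = expr[0]
    if kind == "tc":
        _, predicate, regex = expr
        return (len(triples) == 1
                and triples[0][0] == predicate
                and regex.search(triples[0][1]) is not None)
    if kind == "one":
        return any(matches(sub, triples) for sub in expr[1])
    # "each": try every assignment of triples to sub-expressions (exponential, fine for a demo).
    subs = expr[1]
    for assignment in product(range(len(subs)), repeat=len(triples)):
        parts = [[t for t, i in zip(triples, assignment) if i == k]
                 for k in range(len(subs))]
        if all(matches(sub, part) for sub, part in zip(subs, parts)):
            return True
    return False

# <EmployeeShape>: foaf:name | (foaf:givenName ; foaf:familyName)
employee = one_of(tc("foaf:name"),
                  each_of(tc("foaf:givenName"), tc("foaf:familyName")))

# Additive example: one a.gov creator and one b.mil creator.
creators = each_of(tc("dc:creator", r"^mailto:.*@a\.gov"),
                   tc("dc:creator", r"^mailto:.*@b\.mil"))

partial_pair = [("foaf:name", "Alice Cooper"), ("foaf:familyName", "Cooper")]
print(matches(employee, partial_pair))
```

Under these assumptions the partial pair from the foaf:name example is rejected, because no assignment of its two triples satisfies either branch, while a lone foaf:name or a complete givenName/familyName pair is accepted.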
Received on Friday, 12 August 2016 23:09:08 UTC
