- From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
- Date: Fri, 12 Aug 2016 16:19:30 +0300
- To: Holger Knublauch <holger@topquadrant.com>
- Cc: public-data-shapes-wg <public-data-shapes-wg@w3.org>
- Message-ID: <CA+u4+a1K8=jK_MdhpmPmpR6fUYQCWMfSLe5wwMzRONvVmAagww@mail.gmail.com>
Hi Eric,
If I remember correctly, we initially had sh:xor but ended up dropping it
because XOR with more than two arguments is not intuitive [1] [2].
Assuming we had an sh:onlyOneOf construct in SHACL, would your example get
simplified?
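
A minimal Python sketch of why n-ary XOR is unintuitive, assuming XOR over
several operands is read as a chain of binary XORs (i.e. parity) and that a
hypothetical sh:onlyOneOf would mean "exactly one operand holds". The function
names are illustrative only and are not part of SHACL:

```python
from functools import reduce
from operator import xor


def chained_xor(values):
    """Fold binary XOR over the operands: true iff an odd number are true."""
    return reduce(xor, values, False)


def only_one_of(values):
    """Hypothetical 'exactly one' reading: true iff one operand is true."""
    return sum(1 for v in values if v) == 1


# With two operands the two readings always agree.
for a in (False, True):
    for b in (False, True):
        assert chained_xor([a, b]) == only_one_of([a, b])

# With three true operands they diverge.
three_true = [True, True, True]
print(chained_xor(three_true))  # True
print(only_one_of(three_true))  # False
```

The two readings only start to differ at three or more operands, which is
where a construct named "xor" would likely surprise users.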
Dimitris
[1] http://mathworld.wolfram.com/XOR.html
[2] https://www.w3.org/2014/data-shapes/track/issues/85
On Fri, Aug 12, 2016 at 9:05 AM, Holger Knublauch <holger@topquadrant.com>
wrote:
> I think this will come down to a general design choice. Do we want to add
> this very complex feature to the SHACL Core, or do we simply point people
> at the SPARQL extension mechanism? QCRs cover quite a number of use cases.
> Not every use case will be nicely expressible this way, but OTOH many
> people already know SPARQL and use it every day for these very kinds of
> complex queries. Why do we need to reinvent everything in a higher-level
> language, especially given that, without doubt, someone else will request
> yet another design pattern that is not covered by our core language, even
> with the most general sh:partition feature? We need to stop somewhere.
>
> In SPARQL, the check for your scenario could be something like (untested):
>
> SELECT ?this
> WHERE {
>   FILTER NOT EXISTS {
>     ?this dc:creator ?mil .
>     # str() because regex() needs a string and the creators may be IRIs
>     FILTER regex(str(?mil), "^mailto:.*@b.mil") .
>     ?this dc:creator ?gov .
>     FILTER (regex(str(?gov), "^mailto:.*@a.gov") ||
>       EXISTS { ?gov foaf:mbox ?mbox .
>                FILTER regex(str(?mbox), "^mailto:.*@a.gov") })
>   }
> }
>
> and we don't need to reinvent further wheels. You can even combine these
> with existing SHACL shapes, using sh:hasShape. Or you can make this nicer
> with SHACL functions, e.g.
>
> SELECT ?this
> WHERE {
>   FILTER NOT EXISTS {
>     ?this dc:creator ?mil .
>     FILTER ex:isMilEmail(?mil) .
>     ?this dc:creator ?gov .
>     FILTER (ex:isGovEmail(?gov) ||
>       EXISTS { ?gov foaf:mbox ?mbox . FILTER ex:isGovEmail(?mbox) })
>   }
> }
>
> which is IMHO quite an acceptable Compact Syntax, only far more general.
>
> Holger
>
>
>
> On 12/08/2016 11:01, Eric Prud'hommeaux wrote:
>
>> * Holger Knublauch <holger@topquadrant.com> [2016-08-11 17:14+1000]
>>
>>> This looks like quite a mega feature, if sh:and and sh:or become
>>> overloaded with very different meaning, requiring a new execution
>>> algorithm etc. What about spawning this off into an extension, just
>>> like the SPARQL stuff is in an extension?
>>>
>>> Another option is to handle this on the Compact Syntax level, and produce
>>> QCRs under the hood. Are there any scenarios where QCRs could not (in
>>> principle) express your use cases?
>>>
>> The QCRs are relatively simple to generate but the universal
>> constraint is problematic. Taking the 2nd example below with a
>>
>> shexc:
>> <S> {
>>   ( dc:creator PATTERN "^mailto:.*@a.gov"   # either creator a.gov
>>   | dc:creator {                            # or a creator some node
>>       foaf:mbox PATTERN "^mailto:.*@a.gov"  # with a foaf:mbox of a.gov
>>     }
>>   ) ;
>>   dc:creator PATTERN "^mailto:.*@b.mil"     # and one b.mil creator
>> }
>>
>> dc:creator which may be an email or a bnode with a foaf:mbox, we can
>> compose an additional universal constraint which limits the objects
>> of dc:creator to the three enumerated forms:
>>
>> shacl:
>> <S>
>>   sh:and (
>>     [ sh:or (
>>       # dc:creator PATTERN "^mailto:.*@a.gov" # either creator a.gov
>>       [ sh:property
>>         [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@a.gov" ;
>>           sh:qualifiedMinCount 1 ; sh:qualifiedMaxCount 1
>>         ] ]
>>       # | dc:creator { # or a creator some node
>>       #     foaf:mbox PATTERN "^mailto:.*@a.gov" # with a foaf:mbox of a.gov
>>       #   }
>>       [ sh:property
>>         [ sh:predicate dc:creator ; sh:shape [
>>             sh:property
>>               [ sh:predicate foaf:mbox ; sh:pattern "^mailto:.*@a.gov" ;
>>                 sh:minCount 1 ; sh:maxCount 1
>>               ] ] ; sh:qualifiedMinCount 1 ; sh:qualifiedMaxCount 1
>>         ] ]
>>     ) ]
>>     # dc:creator PATTERN "^mailto:.*@b.mil" # and one b.mil creator
>>     [ sh:property
>>       [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@b.mil" ;
>>         sh:qualifiedMinCount 1 ; sh:qualifiedMaxCount 1
>>       ] ]
>>     # universal constraint to handle closure
>>     [ sh:or (
>>       [ sh:property
>>         [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@a.gov" ;
>>           sh:qualifiedMinCount 1 ; sh:qualifiedMaxCount 1
>>         ] ]
>>       [ sh:property
>>         [ sh:predicate dc:creator ; sh:shape [
>>             sh:property
>>               [ sh:predicate foaf:mbox ; sh:pattern "^mailto:.*@a.gov" ;
>>                 sh:minCount 1 ; sh:maxCount 1
>>               ] ] ; sh:qualifiedMinCount 1 ; sh:qualifiedMaxCount 1
>>         ] ]
>>       [ sh:property
>>         [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@b.mil" ;
>>           sh:qualifiedMinCount 1 ; sh:qualifiedMaxCount 1
>>         ] ]
>>     ) ]
>>   ) .
>>
>> This gets us part way there, but round-tripping may be impossible and
>> it doesn't provide the OneOf OR behavior as described in the
>> foaf:name example below. A typical example from clinical data (FHIR)
>> analogous to the (name|givenName,familyName) example would involve making
>> sure that an osteoporotic spiral fracture had either a single
>> component with a compound code ("46675001|73737008") or two
>> components, each with one of those codes. For instance, ShEx's
>> partitioning semantics allows us to easily capture this error:
>>
>> <Obs1> a fhir:Observation ;
>> fhir:component
>> [ fhir:code "46675001|73737008" ],
>> [ fhir:code "46675001" ].
>>
>>
>> Holger
>>>
>>>
>>> On 11/08/2016 11:32, Eric Prud'hommeaux wrote:
>>>
>>>> The current partition meets some additive use cases like:
>>>> <S> {
>>>>   dc:creator PATTERN "^mailto:a.gov" ;  # one a.gov creator
>>>>   dc:creator PATTERN "^mailto:b.mil"    # and one b.mil creator
>>>> }
>>>>
>>>> but not ones with any algebraic operators like:
>>>> <S> {
>>>>   ( dc:creator PATTERN "^mailto:.*@a.gov"   # either creator a.gov
>>>>   | dc:creator {                            # or a creator some node
>>>>       foaf:mbox PATTERN "^mailto:.*@a.gov"  # with a foaf:mbox of a.gov
>>>>     }
>>>>   ) ;
>>>>   dc:creator PATTERN "^mailto:.*@b.mil"     # and one b.mil creator
>>>> }
>>>>
>>>> An alternative would be to create a syntax to capture ShEx's
>>>> partition semantics, which say:
>>>>   Map the triples to the triple patterns with the same predicate.
>>>>   The node is valid with respect to a triple expression if there
>>>>   is a mapping of triples to triple patterns which satisfies the
>>>>   expression.
>>>> For instance, the data
>>>> <s> dc:creator <mailto:a@b.mil> .
>>>> <s> dc:creator _:b1 .
>>>> _:b1 foaf:mbox <mailto:b@a.gov> .
>>>> satisfies the above pattern.
>>>>
>>>> I propose leveraging the current partition but allowing expressions:
>>>> <S> sh:partition [
>>>>   sh:and (
>>>>     [ sh:property [
>>>>       sh:predicate ex:creator ; sh:minCount 1 ; sh:maxCount 1 ;
>>>>       sh:pattern "^mailto:.*@a.gov" ] ]
>>>>     [ sh:property [
>>>>       sh:predicate ex:creator ; sh:minCount 1 ; sh:maxCount 1 ;
>>>>       sh:pattern "^mailto:.*@b.mil" ] ]
>>>>   )
>>>> ] .
>>>>
>>>> This also handily provides a semantics with a disjunctive OR so e.g.
>>>> <EmployeeShape> {
>>>>   foaf:name .               # either a foaf:name
>>>>   | ( foaf:givenName . ;    # or a pair of givenName
>>>>       foaf:familyName .     # and familyName
>>>>     )
>>>> }
>>>> would not be satisfied with a partial pair:
>>>> <emp1> foaf:name "Alice Cooper" .
>>>> <emp1> foaf:familyName "Cooper" .
>>>> because the 1st disjunct doesn't use the 2nd triple and the 2nd
>>>> disjunct is missing a givenName.
>>>>
>>>> Users wanting either additive properties or disjunctive OR could
>>>> use the sh:partition operator.
>>>>
>>>>
>>>
>
>
--
Dimitris Kontokostas
Department of Computer Science, University of Leipzig & DBpedia Association
Projects: http://dbpedia.org, http://rdfunit.aksw.org,
http://aligned-project.eu
Homepage: http://aksw.org/DimitrisKontokostas
Research Group: AKSW/KILT http://aksw.org/Groups/KILT
Received on Friday, 12 August 2016 13:20:27 UTC