- From: Dimitris Kontokostas <kontokostas@informatik.uni-leipzig.de>
- Date: Fri, 12 Aug 2016 16:19:30 +0300
- To: Holger Knublauch <holger@topquadrant.com>
- Cc: public-data-shapes-wg <public-data-shapes-wg@w3.org>
- Message-ID: <CA+u4+a1K8=jK_MdhpmPmpR6fUYQCWMfSLe5wwMzRONvVmAagww@mail.gmail.com>
Hi Eric,

If I remember correctly we initially had sh:xor but we ended up dropping it
because XOR with more than two arguments was not so intuitive [1] [2].

Assuming we had an sh:onlyOneOf construct in SHACL, would your example get
simplified?

Dimitris

[1] http://mathworld.wolfram.com/XOR.html
[2] https://www.w3.org/2014/data-shapes/track/issues/85

On Fri, Aug 12, 2016 at 9:05 AM, Holger Knublauch <holger@topquadrant.com> wrote:

> I think this will come down to a general design choice. Do we want to add
> this very complex feature to the SHACL Core, or do we simply point people
> at the SPARQL extension mechanism. QCRs cover quite a number of use cases.
> Not every use case will be nicely expressible this way, but then OTOH many
> people already know SPARQL and use it every day for these very kinds of
> complex queries. Why do we need to reinvent everything into a higher-level
> language, esp given that - without doubt - someone else will request yet
> another design pattern that is not covered by our core language even with
> the most general sh:partition feature. We need to stop somewhere.
>
> In SPARQL, the check for your scenario could be something like (untested):
>
> SELECT ?this
> WHERE {
>     FILTER NOT EXISTS {
>         ?this dc:creator ?mil .
>         FILTER regex(?mil, "^mailto:.*@b.mil") .
>         ?this dc:creator ?gov .
>         FILTER (regex(?gov, "^mailto:.*@a.gov") ||
>             EXISTS { ?gov foaf:mbox ?mbox . FILTER regex(?mbox, "^mailto:.*@a.gov") })
>     }
> }
>
> and we don't need to reinvent further wheels. You can even combine these
> with existing SHACL shapes, using sh:hasShape. Or you can make this nicer
> with SHACL functions, e.g.
>
> SELECT ?this
> WHERE {
>     FILTER NOT EXISTS {
>         ?this dc:creator ?mil .
>         FILTER ex:isMilEmail(?mil) .
>         ?this dc:creator ?gov .
>         FILTER (ex:isGovEmail(?gov) || EXISTS { ?gov foaf:mbox ?mbox .
>             FILTER ex:isGovEmail(?mbox) })
>     }
> }
>
> which is IMHO quite an acceptable Compact Syntax, only far more general.
>
> Holger
>
>
>
> On 12/08/2016 11:01, Eric Prud'hommeaux wrote:
>
>> * Holger Knublauch <holger@topquadrant.com> [2016-08-11 17:14+1000]
>>
>>> This looks like quite a mega feature, if sh:and and sh:or become overloaded
>>> with very different meaning, requiring a new execution algorithm etc. What
>>> about spawning this off into an extension, just like the SPARQL stuff is in
>>> an extension?
>>>
>>> Another option is to handle this on the Compact Syntax level, and produce
>>> QCRs under the hood. Are there any scenarios where QCRs could not (in
>>> principle) express your use cases?
>>>
>> The QCRs are relatively simple to generate but the universal
>> constraint is problematic. Taking the 2nd example below with a
>>
>> shexc:
>> <S> {
>>   ( dc:creator PATTERN "^mailto:.*@a.gov"   # either creator a.gov
>>   | dc:creator {                            # or a creator some node
>>       foaf:mbox PATTERN "^mailto:.*@a.gov"  # with a foaf:mbox of a.gov
>>     }
>>   ) ;
>>   dc:creator PATTERN "^mailto:.*@b.mil"     # and one b.mil creator
>> }
>>
>> dc:creator which may be an email or a bnode with a foaf:mbox, we can
>> compose an additional universal constraint which limits the objects
>> of dc:creator to the three enumerated forms:
>>
>> shacl:
>> <S>
>>   sh:and (
>>     [ sh:or (
>>       # dc:creator PATTERN "^mailto:.*@a.gov"     # either creator a.gov
>>       [ sh:property
>>         [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@a.gov" ;
>>           sh:qualifiedMinCount 1; sh:qualifiedMaxCount 1
>>         ] ]
>>       # | dc:creator {                            # or a creator some node
>>       #     foaf:mbox PATTERN "^mailto:.*@a.gov"  # with a foaf:mbox of a.gov
>>       #   }
>>       [ sh:property
>>         [ sh:predicate dc:creator ; sh:shape [
>>             sh:property
>>               [ sh:predicate foaf:mbox ; sh:pattern "^mailto:.*@a.gov" ;
>>                 sh:minCount 1; sh:maxCount 1
>>               ] ] ; sh:qualifiedMinCount 1; sh:qualifiedMaxCount 1
>>         ] ]
>>     ) ]
>>     # dc:creator PATTERN "^mailto:.*@b.mil"       # and one b.mil creator
>>     [ sh:property
>>       [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@b.mil" ;
>>         sh:qualifiedMinCount 1; sh:qualifiedMaxCount 1
>>       ] ]
>>     # universal constraint to handle closure
>>     [ sh:or (
>>       [ sh:property
>>         [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@a.gov" ;
>>           sh:qualifiedMinCount 1; sh:qualifiedMaxCount 1
>>         ] ]
>>       [ sh:property
>>         [ sh:predicate dc:creator ; sh:shape [
>>             sh:property
>>               [ sh:predicate foaf:mbox ; sh:pattern "^mailto:.*@a.gov" ;
>>                 sh:minCount 1; sh:maxCount 1
>>               ] ] ; sh:qualifiedMinCount 1; sh:qualifiedMaxCount 1
>>         ] ]
>>       [ sh:property
>>         [ sh:predicate dc:creator ; sh:pattern "^mailto:.*@b.mil" ;
>>           sh:qualifiedMinCount 1; sh:qualifiedMaxCount 1
>>         ] ]
>>     ) ]
>>   ) .
>>
>> This gets us part way there, but round-tripping may be impossible and
>> it doesn't provide the OneOf OR behavior as described in the
>> foaf:name example below. A typical example from clinical data (FHIR)
>> analogous to the (name|givenName,familyName) example would involve making
>> sure that an osteoporotic spiral fracture had either a single
>> component with a compound code ("46675001|73737008") or two
>> components, each with one of those codes. For instance, ShEx's
>> partitioning semantics allows us to easily capture this error:
>>
>> <Obs1> a fhir:Observation ;
>>   fhir:component
>>     [ fhir:code "46675001|73737008" ],
>>     [ fhir:code "46675001" ].
>>
>>
>> Holger
>>>
>>>
>>> On 11/08/2016 11:32, Eric Prud'hommeaux wrote:
>>>
>>>> The current partition meets some additive use cases like:
>>>> <S> {
>>>>   dc:creator PATTERN "^mailto:a.gov" ;  # one a.gov creator
>>>>   dc:creator PATTERN "^mailto:b.mil"    # and one b.mil creator
>>>> }
>>>>
>>>> but not ones with any algebraic operators like:
>>>> <S> {
>>>>   ( dc:creator PATTERN "^mailto:.*@a.gov"   # either creator a.gov
>>>>   | dc:creator {                            # or a creator some node
>>>>       foaf:mbox PATTERN "^mailto:.*@a.gov"  # with a foaf:mbox of a.gov
>>>>     }
>>>>   ) ;
>>>>   dc:creator PATTERN "^mailto:.*@b.mil"     # and one b.mil creator
>>>> }
>>>>
>>>> An alternative would be to create a syntax to capture ShEx's
>>>> partition semantics which say:
>>>>   Map the triples to the triple patterns with the same predicate.
>>>>   The node is valid with respect to a triple expression if there
>>>>   is a mapping of triple to triple pattern which satisfies the
>>>>   expression.
>>>> For instance, the data
>>>>   <s> dc:creator <mailto:a@b.mil> .
>>>>   <s> dc:creator _:b1 .
>>>>   _:b1 foaf:mbox <mailto:b@a.gov> .
>>>> satisfies the above pattern.
>>>>
>>>> I propose leveraging the current partition but allowing expressions:
>>>> <S> sh:partition [
>>>>   sh:and (
>>>>     [ sh:property [
>>>>       sh:predicate ex:creator ; sh:minCount 1 ; sh:maxCount 1 ;
>>>>       sh:pattern "^mailto:.*@a.gov" ] ]
>>>>     [ sh:property [
>>>>       sh:predicate ex:creator ; sh:minCount 1 ; sh:maxCount 1 ;
>>>>       sh:pattern "^mailto:.*@b.mil" ] ]
>>>>   ) ] .
>>>>
>>>> This also handily provides a semantics with a disjunctive OR so e.g.
>>>> <EmployeeShape> {
>>>>   foaf:name .               # either a foaf:name
>>>>   | ( foaf:givenName . ;    # or a pair of givenName
>>>>       foaf:familyName .     # and familyName
>>>>     )
>>>> }
>>>> would not be satisfied with a partial pair:
>>>>   <emp1> foaf:name "Alice Cooper" .
>>>>   <emp1> foaf:familyName "Cooper" .
>>>> because the 1st disjunct doesn't use the 2nd triple and the 2nd
>>>> disjunct is missing a givenName.
>>>>
>>>> Users wanting either additive properties or disjunctive OR could
>>>> use the sh:partition operator.
>>>>
>>>>
>>>
>
>

--
Dimitris Kontokostas
Department of Computer Science, University of Leipzig & DBpedia Association
Projects: http://dbpedia.org, http://rdfunit.aksw.org, http://aligned-project.eu
Homepage: http://aksw.org/DimitrisKontokostas
Research Group: AKSW/KILT http://aksw.org/Groups/KILT
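
A small illustration of the point about XOR with more than two arguments, as a rough sketch only. It assumes sh:xor would have been read as binary XOR chained over its arguments (true for an odd number of true operands), and that the hypothetical sh:onlyOneOf would mean "exactly one operand holds". The Python function names are made up for this note and are not part of SHACL or any library.

```python
from functools import reduce
from operator import xor


def chained_xor(flags):
    """Fold binary XOR over the flags: true when an odd number are true."""
    return reduce(xor, flags, False)


def only_one_of(flags):
    """Assumed reading of a hypothetical sh:onlyOneOf: exactly one flag is true."""
    return sum(1 for flag in flags if flag) == 1


# Three alternatives that all hold for the same node.
all_three = [True, True, True]
print(chained_xor(all_three))  # True: an odd number of operands are true
print(only_one_of(all_three))  # False: more than one operand is true
```

With two operands the two readings agree, which is why the difference only shows up once a third alternative is added.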
Received on Friday, 12 August 2016 13:20:27 UTC