Re: Proposal for "Repeated Property" Requirement - sh:partition

Thank you Arthur,

Now (I think) it is clear to me.

Here are two examples that are not captured by the partition construct.

1) Disjunction is involved

For example (inspired by Eric's examples), we require that a person is 
identified by an id matching "^http://id.loc.gov/", and additionally by 
either an id matching "^http://viaf.org/" or an id matching 
"^http://example.org/".

In ShEx it would be written as:

<BFPersonInterface1> {
       bf:identifiedBy IRI PATTERN "^http://id.loc.gov/" ,
       ( bf:identifiedBy IRI PATTERN "^http://viaf.org/"
         | bf:identifiedBy IRI PATTERN "^http://example.org/"
       )
  }

where the vertical bar | stands for disjunction (OR).
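
To make the intended reading concrete, here is a minimal Python sketch of 
the check on the set of bf:identifiedBy values of a person. It is only an 
illustration under assumptions of mine: the function name 
matches_person_interface is hypothetical, each triple constraint is taken 
to have the default cardinality of exactly one, and no further 
bf:identifiedBy values are allowed.

```python
import re
from typing import Iterable

# Patterns copied verbatim from the ShEx example above.
LOC = re.compile("^http://id.loc.gov/")
VIAF = re.compile("^http://viaf.org/")
EXAMPLE = re.compile("^http://example.org/")


def matches_person_interface(identifiers: Iterable[str]) -> bool:
    """Check one loc.gov id AND (one viaf.org id OR one example.org id).

    Assumption: every value must be accounted for by exactly one
    constraint, so exactly two values are expected in total.
    """
    ids = set(identifiers)  # RDF property values form a set
    loc = sum(1 for i in ids if LOC.search(i))
    viaf = sum(1 for i in ids if VIAF.search(i))
    example = sum(1 for i in ids if EXAMPLE.search(i))
    return loc == 1 and viaf + example == 1 and len(ids) == 2


with_viaf = matches_person_interface(
    ["http://id.loc.gov/a", "http://viaf.org/b"])
with_example = matches_person_interface(
    ["http://id.loc.gov/a", "http://example.org/c"])
loc_only = matches_person_interface(["http://id.loc.gov/a"])
```

Under these assumptions the first two calls succeed and the third fails, 
which is the behaviour I would like to be able to express in SHACL.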

I am not sure whether Or and qualified cardinality in SHACL allow this to 
be expressed. As far as I understand partition, it cannot express it.

I think this pattern could be useful in several situations. We might 
additionally want to allow further bf:identifiedBy values.

2) Ordering should not matter

Consider the following example (using ShExC syntax).
It requires that an issue is reproduced by one or more testers, and by 
one or more programmers.

my:IssueShape {
   ex:state (ex:accepted ex:resolved),
   ex:reproducedBy @my:TesterShape +,
   ex:reproducedBy @my:ProgrammerShape +
}

my:TesterShape {
   foaf:name xsd:string,
   ex:role (ex:testingRole)
}

my:ProgrammerShape {
   foaf:name xsd:string,
   ex:department (ex:ProgrammingDepartment)
}

Intuitively the following data should pass, as "Joe" is a tester, and 
"Jessy" is a programmer.

inst:Issue1
   ex:state ex:accepted ;
   ex:reproducedBy inst:joe ;
   ex:reproducedBy inst:jessy .

inst:joe
   foaf:name "Joe" ;
   ex:role ex:testingRole .

inst:jessy
   foaf:name "Jessy" ;
   ex:role ex:testingRole ;
   ex:department ex:ProgrammingDepartment .


Now, if the partition for ex:reproducedBy is

sh:Partition ( [sh:minCount 1; sh:valueShape @<TesterShape>]
               [sh:minCount 1; sh:valueShape @<ProgrammerShape>] )

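then the data would fail: both inst:joe and inst:jessy satisfy 
TesterShape, so the first QCC consumes them and nothing is left for 
ProgrammerShape.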

If the partition is (different order)

sh:Partition ( [sh:minCount 1; sh:valueShape @<ProgrammerShape>]
               [sh:minCount 1; sh:valueShape @<TesterShape>] )

then the data would pass: the first QCC consumes only inst:jessy, and 
inst:joe remains for TesterShape.

Note that no ordering of the partition ensures that all (intuitively) 
correct data pass. (With the second ordering above, a problem occurs if 
joe is a programmer and jessy is both tester and programmer.)
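
The ordering dependence can be reproduced with a small Python sketch of 
the greedy algorithm from Arthur's proposal (quoted below). This is my own 
illustration, not a reference implementation: a QCC is modelled as a 
(predicate, minCount, maxCount) triple, and the sets TESTERS and 
PROGRAMMERS are hypothetical stand-ins for evaluating TesterShape and 
ProgrammerShape on the data above.

```python
from typing import Callable, Iterable, List, Optional, Tuple

# (node test, minCount, maxCount); maxCount None means unbounded.
QCC = Tuple[Callable[[str], bool], int, Optional[int]]


def greedy_partition(nodes: Iterable[str], partition: List[QCC]) -> bool:
    """Process the QCCs in list order, removing matched nodes each time."""
    remaining = set(nodes)
    for matches, min_count, max_count in partition:
        matched = {n for n in remaining if matches(n)}
        if len(matched) < min_count:
            return False  # cardinality violation
        if max_count is not None and len(matched) > max_count:
            return False  # cardinality violation
        remaining -= matched
    return not remaining  # leftover nodes are a violation


# Stand-ins for shape evaluation: joe is a tester, jessy is both.
TESTERS = {"inst:joe", "inst:jessy"}
PROGRAMMERS = {"inst:jessy"}

tester_qcc: QCC = (lambda n: n in TESTERS, 1, None)
programmer_qcc: QCC = (lambda n: n in PROGRAMMERS, 1, None)
values = ["inst:joe", "inst:jessy"]

tester_first = greedy_partition(values, [tester_qcc, programmer_qcc])
programmer_first = greedy_partition(values, [programmer_qcc, tester_qcc])
print(tester_first, programmer_first)
```

With the tester QCC first the check fails, and with the programmer QCC 
first it passes, as described above.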


The question now is how hard or easy it would be to adapt the Partition 
constraint to handle these cases.

My intuition is that for 2) (Ordering should not matter), one has to 
consider the partition as a set of QCCs (not a list), and define its 
semantics independently of the ordering. For 1) (Disjunction involved), 
I don't have any idea.
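
One possible reading of an order-independent semantics, sketched in 
Python: the partition is satisfied if there exists some assignment of 
every node to exactly one QCC whose node constraints it satisfies, such 
that all cardinalities hold. This is an assumption of mine and not part of 
the proposal; the names set_partition and make_qcc are hypothetical. The 
search is brute force and exponential, so it gives up the efficiency of 
the greedy algorithm and only serves to state the intended semantics.

```python
from itertools import product
from typing import Callable, List, Optional, Sequence, Set, Tuple

# (node test, minCount, maxCount); maxCount None means unbounded.
QCC = Tuple[Callable[[str], bool], int, Optional[int]]


def make_qcc(members: Set[str], lo: int = 1, hi: Optional[int] = None) -> QCC:
    """Hypothetical helper: a QCC whose node test is set membership."""
    return (lambda n: n in members, lo, hi)


def set_partition(nodes: Sequence[str], qccs: List[QCC]) -> bool:
    """True if some assignment of nodes to QCCs meets all cardinalities."""
    # For each node, the indices of the QCCs it could be assigned to.
    options = [
        [i for i, (matches, _, _) in enumerate(qccs) if matches(n)]
        for n in nodes
    ]
    # A node with no option makes the product empty, hence a failure.
    for assignment in product(*options):
        counts = [assignment.count(i) for i in range(len(qccs))]
        if all(lo <= c and (hi is None or c <= hi)
               for c, (_, lo, hi) in zip(counts, qccs)):
            return True
    return False


values = ["inst:joe", "inst:jessy"]
# Case A: joe is a tester, jessy is both tester and programmer.
case_a = set_partition(values, [make_qcc({"inst:joe", "inst:jessy"}),
                                make_qcc({"inst:jessy"})])
# Case B: joe is a programmer, jessy is both.
case_b = set_partition(values, [make_qcc({"inst:jessy"}),
                                make_qcc({"inst:joe", "inst:jessy"})])
print(case_a, case_b)
```

Both cases pass under this reading, whichever order the QCCs are written 
in, while data with no programmer at all still fails.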


Iovka

On 30/09/2015 02:04, Arthur Ryman wrote:
> Iovka,
>
> sh:partition is a Property Constraint (or Inverse Property Constraint),
> like sh:minCount, sh:maxCount, sh:nodeKind, etc.
>
> sh:partition is evaluated on the set of all nodes that are values of
> the property (or inverse property). It can be either open or closed,
> depending on whether the last sh:QCC is a catchall constraint. The
> empty QCC [ ] implies no constraints, so if that's the last one, then
> the partition is in effect open.
>
> For example,
>
> sh:Partition ( [sh:minCount 2; sh:nodeKind sh:IRI]  [ ] )
>
> is true if the focus node has 2 or more IRI values, and anything else.
>
> -- Arthur
>
> On Tue, Sep 29, 2015 at 1:14 AM, Iovka Boneva
> <iovka.boneva@univ-lille1.fr> wrote:
>> Arthur,
>>
>> I would like to be sure I understand how the Partition constraint would
>> work. Here are a few questions:
>>
>> - is Partition a constraint in the same way as the other constraints
>> (sh:property, sh:NotConstraint, sh:AndConstraint, sh:OrConstraint, etc.)?
>> Can it be combined with Or, And, Not ? How does it interact with these ?
>>
>> - is a Partition always evaluated on the whole neighbourhood ? If yes, does
>> this mean that this is a "closed" constraint, in the sense that it does not
>> allow any arcs other than those specified in the QCCs of the Partition
>> ?
>>
>> Thank you,
>> Iovka
>>
>>
>> On 26/09/2015 01:13, Arthur Ryman wrote:
>>> I've been following the discussion about repeated properties and
>>> qualified cardinality constraints, and would like to propose a new
>>> SHACL language element, sh:partition, that I believe will satisfy the
>>> requirements.
>>>
>>> I think the use cases suggest that SHACL needs a way to say that a set of
>>> nodes must be partitioned into a certain number of disjoint subsets.
>>> Each subset contains nodes that satisfy certain constraints. Each
>>> subset must satisfy certain cardinality constraints.
>>>
>>> In the case of repeated properties, we are looking at the set of all
>>> values for a given property (or inverse property) of a given focus
>>> node. Sets of nodes occur in other contexts and may need to be
>>> similarly constrained.
>>>
>>> It would be a very good thing if a SHACL processor could efficiently
>>> determine if a given set of nodes could be partitioned according to a
>>> given partition spec.
>>>
>>> SHACL already has sh:minCount and sh:maxCount properties which apply
>>> to sets of nodes.
>>>
>>> SHACL also already has many other properties that define constraints
>>> on a given node. These are tests or checks that apply to a node and
>>> are either true or false. Holger listed many of them, e.g.
>>> - sh:allowedValues
>>> - sh:class
>>> - sh:datatype
>>> - sh:directType
>>> - sh:minLength
>>> - sh:maxLength
>>> - sh:nodeKind
>>> - sh:maxExclusive etc
>>> - sh:pattern
>>>
>>> I propose to define a new RDF type, sh:QCC, for things that specify
>>> qualified cardinality constraints. However, sh:QCC will normally be
>>> understood from the context and does not need to appear explicitly in
>>> the shapes graph.
>>>
>>> A sh:QCC may have:
>>> - zero or one sh:minCount
>>> - zero or one sh:maxCount
>>> - zero or more node constraints, from the following list (and possibly
>>> others that make sense)
>>> - sh:shape
>>> - sh:allowedValues
>>> - sh:class
>>> - sh:datatype
>>> - sh:directType
>>> - sh:minLength
>>> - sh:maxLength
>>> - sh:nodeKind
>>> - sh:maxExclusive etc
>>> - sh:pattern
>>>
>>> A partition is specified by an rdf:List of sh:QCC nodes. Define
>>> sh:Partition to be this subclass of rdf:List. Again, sh:Partition need
>>> not appear explicitly.
>>> A constraint may have zero or more sh:partition properties whose
>>> values are sh:Partition nodes. All must be satisfied.
>>>
>>> The interpretation of a sh:Partition node as a constraint is as follows:
>>>
>>> Let the given set of nodes be X.
>>> Let the sh:Partition node be the list P = (qcc1, qcc2, ..., qccn).
>>>
>>> For each qcc in P do the following:
>>>      Let Y be the subset of X that satisfies the node constraints in qcc.
>>>      If Y violates the cardinality constraints of qcc then report a
>>> violation and break.
>>>      Otherwise remove Y from X and continue.
>>> End for.
>>> If X is not empty then report a violation.
>>> Otherwise report that P is satisfied.
>>>
>>> Note that this is a greedy algorithm. Each qcc in the list is matched
>>> to the fullest extent. Nodes that match one qcc are removed from
>>> further consideration. Also, the qcc's are checked in the order given
>>> in the list so there is no combinatorial explosion.
>>>
>>> Eric proposed the following example [1]:
>>>
>>> <BFPersonInterface1> sh:property [
>>>         sh:predicate bf:identifiedBy ; sh:pattern "^http://id.loc.gov/" ;
>>>         sh:minCount 1 ; sh:maxCount 1
>>>       ], [
>>>         sh:predicate bf:identifiedBy ; sh:pattern "^http://viaf.org/" ;
>>>         sh:minCount 1
>>>       ] .
>>>
>>> In my proposal, this becomes:
>>>
>>> <BFPersonInterface1> sh:property [
>>>         sh:predicate bf:identifiedBy ;
>>>         sh:partition (
>>>            [sh:pattern "^http://id.loc.gov/" ; sh:minCount 1 ; sh:maxCount 1]
>>>            [sh:pattern "^http://viaf.org/" ; sh:minCount 1 ]
>>>         )
>>>       ] .
>>>
>>> [1]
>>> https://lists.w3.org/Archives/Public/public-data-shapes-wg/2015Sep/0107.html
>>>
>>> -- Arthur
>>>
>>
>> --
>> Iovka Boneva
>> Associate professor (MdC) Université de Lille
>> http://www.cristal.univ-lille.fr/~boneva/
>> +33 6 95 75 70 25
>>
>>


-- 
Iovka Boneva
Associate professor (MdC) Université de Lille
http://www.cristal.univ-lille.fr/~boneva/
+33 6 95 75 70 25

Received on Thursday, 1 October 2015 14:35:58 UTC