- From: Arthur Ryman <arthur.ryman@gmail.com>
- Date: Fri, 25 Sep 2015 19:13:26 -0400
- To: "public-data-shapes-wg@w3.org" <public-data-shapes-wg@w3.org>
I've been following the discussion about repeated properties and qualified cardinality constraints, and would like to propose a new SHACL language element, sh:partition, that I believe will satisfy the requirements.

I think the use cases suggest that SHACL needs a way to say that a set of nodes must be partitioned into a certain number of disjoint subsets. Each subset contains nodes that satisfy certain constraints. Each subset must satisfy certain cardinality constraints.

In the case of repeated properties, we are looking at the set of all values for a given property (or inverse property) of a given focus node. Sets of nodes occur in other contexts and may need to be similarly constrained.

It would be a very good thing if a SHACL processor could efficiently determine whether a given set of nodes can be partitioned according to a given partition spec.

SHACL already has sh:minCount and sh:maxCount properties, which apply to sets of nodes. SHACL also already has many other properties that define constraints on a given node. These are tests or checks that apply to a node and are either true or false. Holger listed many of them, e.g.
- sh:allowedValues
- sh:class
- sh:datatype
- sh:directType
- sh:minLength
- sh:maxLength
- sh:nodeKind
- sh:maxExclusive etc.
- sh:pattern

I propose to define a new RDF type, sh:QCC, for things that specify qualified cardinality constraints. However, sh:QCC will normally be understood from the context and does not need to appear explicitly in the shapes graph.

A sh:QCC may have:
- zero or one sh:minCount
- zero or one sh:maxCount
- zero or more node constraints, from the following list (and possibly others that make sense)
  - sh:shape
  - sh:allowedValues
  - sh:class
  - sh:datatype
  - sh:directType
  - sh:minLength
  - sh:maxLength
  - sh:nodeKind
  - sh:maxExclusive etc.
  - sh:pattern

A partition is specified by an rdf:List of sh:QCC nodes. Define sh:Partition to be this subclass of rdf:List. Again, sh:Partition need not appear explicitly.

A constraint may have zero or more sh:partition properties whose values are sh:Partition nodes. All must be satisfied.

The interpretation of a sh:Partition node as a constraint is as follows:

Let the given set of nodes be X.
Let the sh:Partition node be the list P = (qcc1, qcc2, ..., qccn).
For each qcc in P do the following:
    Let Y be the subset of X that satisfies the node constraints in qcc.
    If Y violates the cardinality constraints of qcc then report a violation and break.
    Otherwise remove Y from X and continue.
End for.
If X is not empty then report a violation.
Otherwise report that P is satisfied.

Note that this is a greedy algorithm. Each qcc in the list is matched to the fullest extent. Nodes that match one qcc are removed from further consideration. Also, the qccs are checked in the order given in the list, so there is no combinatorial explosion.

Eric proposed the following example [1]:

    <BFPersonInterface1>
        sh:property [
            sh:predicate bf:identifiedBy ;
            sh:pattern "^http://id.loc.gov/" ;
            sh:minCount 1 ;
            sh:maxCount 1
        ], [
            sh:predicate bf:identifiedBy ;
            sh:pattern "^http://viaf.org/" ;
            sh:minCount 1
        ] .

In my proposal, this becomes:

    <BFPersonInterface1>
        sh:property [
            sh:predicate bf:identifiedBy ;
            sh:partition (
                [ sh:pattern "^http://id.loc.gov/" ; sh:minCount 1 ; sh:maxCount 1 ]
                [ sh:pattern "^http://viaf.org/" ; sh:minCount 1 ]
            )
        ] .

[1] https://lists.w3.org/Archives/Public/public-data-shapes-wg/2015Sep/0107.html

-- Arthur
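
For readers who want to experiment, the greedy interpretation of a sh:Partition described above can be sketched in Python. This is only an illustration under several assumptions, not part of the proposal or of any SHACL implementation: nodes are represented as plain strings, the node constraints of a qcc are collapsed into a single predicate function, and the names QCC, pattern, check_partition and BF_PERSON_PARTITION are hypothetical. The example URIs in the usage note are likewise invented.

```python
import re
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Tuple


@dataclass
class QCC:
    """Hypothetical stand-in for a sh:QCC node."""
    matches: Callable[[str], bool]   # all node constraints of the qcc, as one test
    min_count: Optional[int] = None  # sh:minCount; None means no lower bound
    max_count: Optional[int] = None  # sh:maxCount; None means no upper bound


def pattern(regex: str) -> Callable[[str], bool]:
    """Node test loosely modelled on sh:pattern (assumed to be a regex search)."""
    compiled = re.compile(regex)
    return lambda node: compiled.search(node) is not None


def check_partition(nodes: Iterable[str], partition: List[QCC]) -> Tuple[bool, str]:
    """Greedy check: each qcc claims every remaining node that it matches."""
    remaining = set(nodes)                       # the set X
    for index, qcc in enumerate(partition, start=1):
        matched = {n for n in remaining if qcc.matches(n)}   # the subset Y
        if qcc.min_count is not None and len(matched) < qcc.min_count:
            return False, f"qcc{index}: fewer than minCount {qcc.min_count}"
        if qcc.max_count is not None and len(matched) > qcc.max_count:
            return False, f"qcc{index}: more than maxCount {qcc.max_count}"
        remaining -= matched                     # remove Y from X and continue
    if remaining:
        return False, f"{len(remaining)} node(s) matched no qcc"
    return True, "partition satisfied"


# The two qccs from the bf:identifiedBy example, in list order.
BF_PERSON_PARTITION = [
    QCC(pattern("^http://id.loc.gov/"), min_count=1, max_count=1),
    QCC(pattern("^http://viaf.org/"), min_count=1),
]
```

Under these assumptions, check_partition(["http://id.loc.gov/a", "http://viaf.org/b"], BF_PERSON_PARTITION) reports success, while adding a value that matches neither pattern reports a violation because X is not empty at the end. Each qcc is evaluated once against the remaining nodes, so the cost grows with the number of qccs times the number of nodes rather than with the number of possible partitions.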
Received on Friday, 25 September 2015 23:13:54 UTC