- From: Holger Knublauch <holger@topquadrant.com>
- Date: Wed, 8 Jun 2016 12:11:26 +1000
- To: "public-data-shapes-wg@w3.org" <public-data-shapes-wg@w3.org>
- Message-ID: <955dbc96-9daa-b514-9ab7-20294c53dcf9@topquadrant.com>
On 7/06/2016 0:54, Peter F. Patel-Schneider wrote:
> Are you proposing that there should be a constraint component for primary
> keys? I don't see any description of how this would work in property
> constraints so how can anyone determine how it would work in other
> constraints. If you are not proposing that there be a constraint component
> for primary keys then I don't see any relevance to the discussion here.

I would like to elaborate on this use case a bit because it has implications for ISSUE-139.

The "Primary Keys" feature has been mentioned in our Use Cases deliverable as UC25 [1] and in the wiki

    https://www.w3.org/2014/data-shapes/wiki/Primary_Keys_with_URI_Pattern

This is a feature that has been in successful use based on SPIN in TopBraid products for a couple of years now. I have since ported it to SHACL as part of the DASH namespace. I have pasted a source code snippet at the bottom of this email.

An example in SHACL would be

    ex:CountryShape
        a sh:Shape ;
        sh:scopeClass ex:Country ;
        sh:property [
            sh:predicate ex:countryCode ;
            dash:uriStart "http://example.org/Country-" ;
        ] .

A valid instance would be

    ex:Country-de
        rdf:type ex:Country ;
        ex:countryCode "de" .

An invalid instance would be

    ex:Country-incorrect
        rdf:type ex:Country ;
        ex:countryCode "en" .

The rule is that if a property has a dash:uriStart then that property serves as a "primary key", which means that it must have exactly one value and the URI of the subject must be the uriStart followed by the value of the primary key, e.g. "ex:Country-de" for a primary key value of "de" and a uriStart of "ex:Country-".

This constraint component highlights an important strength of SHACL: it provides machine-readable definitions of constraints that can be used for validation purposes but also for many other use cases. In our particular case, we are using the primary key to produce well-formed URIs from a primary key value when a new instance is created.
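
As a rough illustration of that URI production step (a sketch only, not the TopBraid implementation), the rule can be written in a few lines of Python. The function name mint_uri and the constant COUNTRY_URI_START are made up for this example, and urllib.parse.quote with safe="" is assumed to be a close enough stand-in for SPARQL's ENCODE_FOR_URI.

```python
from urllib.parse import quote

# Hypothetical constant mirroring the dash:uriStart value from the example shape.
COUNTRY_URI_START = "http://example.org/Country-"


def mint_uri(uri_start: str, key_value: str) -> str:
    """Build a subject URI as uriStart + percent-encoded primary key value.

    Assumption: quote(..., safe="") approximates ENCODE_FOR_URI by escaping
    everything except unreserved characters.
    """
    return uri_start + quote(key_value, safe="")


if __name__ == "__main__":
    # A tool could call this when the user enters a country code for a new instance.
    print(mint_uri(COUNTRY_URI_START, "de"))
```

Because the URI is a pure function of the dash:uriStart value and the key, the same helper also serves as a lookup: given a country code, the URI can be computed without querying the data.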
I have attached a screenshot of TopBraid showing a dialog in which the user just enters the name of a Country and its country code, and the URI gets produced automatically.

The fact that this constraint can also be used for constraint checking is a great way of locking in a contract in the model, but the dash:uriStart triples can be queried by user interface tools too, and that use case is far more important than constraint validation here. For example, once we know that a primary key exists, looking up the URI for a given country is trivial and doesn't even require a query against the database.

Clearly, this constraint component makes no sense for either inverse properties or node constraints. For inverse property constraints it makes no sense because the values could never be literals, making them unsuitable for building new URIs. For node constraints, it would be very hard even to come up with an explanation of what it could possibly mean. In a node constraint, we would have $this = $value, i.e. the subject that is supposed to have a certain URI is the same as the value of the primary key! Such a constraint is impossible to fulfill because the URI would have to include itself recursively. This is just to show how silly such examples can become with the strict policy suggested here in this ticket.

By having a sh:context triple (see below), the creator of such a constraint component can clearly communicate how this component is supposed to be used and where it should not even be offered as a choice. And the query of the validator (below) is quite irregular and would not fit into any of the proposed "boilerplate" generalizations.

This example demonstrates that the proposals to have only one validator per constraint component, and to always allow every constraint component, make SHACL fail to address real-world requirements.

Holger

[1] https://www.w3.org/TR/shacl-ucr/#uc25-primary-keys-with-uri-patterns

    dash:PrimaryKeyConstraintComponent
        rdf:type sh:ConstraintComponent ;
        rdfs:comment "Enforces a constraint that the given property (sh:predicate) serves as primary key for all resources in the scope of the shape. If a property has been declared to be the primary key then each resource must have exactly one value for that property. Furthermore, the URIs of those resources must start with a given string (dash:uriStart), followed by the URL-encoded primary key value. For example if dash:uriStart is \"http://example.org/country-\" and the primary key for an instance is \"de\" then the URI must be \"http://example.org/country-de\". Finally, as a result of the URI policy, there can not be any other resource with the same value under the same primary key policy." ;
        rdfs:label "Primary key constraint component" ;
        sh:context sh:PropertyConstraint ;
        sh:labelTemplate "The property {?predicate} is the primary key and URIs start with {?uriStart}" ;
        sh:parameter [
            sh:predicate dash:uriStart ;
            sh:datatype xsd:string ;
            sh:description "The start of the URIs of well-formed resources." ;
            sh:name "URI start" ;
        ] ;
        sh:propertyValidator [
            rdf:type sh:SPARQLSelectValidator ;
            sh:select """SELECT $this ($this AS ?subject) $predicate (?value AS ?object) ?message
                WHERE {
                    {
                        FILTER NOT EXISTS { ?this $predicate ?any . } .
                        BIND (\"Missing value for primary key property\" AS ?message) .
                    }
                    UNION
                    {
                        FILTER (dash:valueCount(?this, $predicate) > 1) .
                        BIND (\"Multiple values of primary key property\" AS ?message) .
                    }
                    UNION
                    {
                        FILTER (dash:valueCount(?this, $predicate) = 1) .
                        ?this $predicate ?value .
                        BIND (CONCAT($uriStart, ENCODE_FOR_URI(str(?value))) AS ?uri) .
                        FILTER (str(?this) != ?uri) .
                        BIND (CONCAT(\"Primary key value \", str(?value), \" does not align with the expected URI \", ?uri) AS ?message) .
                    }
                    .
                }""" ;
        ] ;
    .
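
For readers who find the three UNION branches of the validator query hard to follow, the same logic can be sketched outside SPARQL. This is only an illustrative Python rendering, not part of DASH: the function check_primary_key and its parameters are hypothetical, the property values are passed in as a plain list instead of being read from a graph, and urllib.parse.quote with safe="" is assumed to approximate ENCODE_FOR_URI.

```python
from typing import List, Optional
from urllib.parse import quote


def check_primary_key(subject_uri: str, values: List[str], uri_start: str) -> Optional[str]:
    """Return a violation message, or None if the subject conforms.

    values holds the subject's values for the primary key property.
    """
    # Branch 1: no value at all for the primary key property.
    if len(values) == 0:
        return "Missing value for primary key property"
    # Branch 2: more than one value.
    if len(values) > 1:
        return "Multiple values of primary key property"
    # Branch 3: exactly one value, so the subject URI must equal
    # uriStart + encoded value (assumed equivalent of ENCODE_FOR_URI).
    value = values[0]
    expected = uri_start + quote(str(value), safe="")
    if subject_uri != expected:
        return ("Primary key value " + str(value)
                + " does not align with the expected URI " + expected)
    return None


if __name__ == "__main__":
    start = "http://example.org/Country-"
    print(check_primary_key("http://example.org/Country-de", ["de"], start))
    print(check_primary_key("http://example.org/Country-incorrect", ["en"], start))
```

The first call corresponds to the valid instance from the example above and yields no message; the second corresponds to the invalid instance and reports the mismatch between the key value and the subject URI.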
Attachments
- image/png attachment: PrimaryKeyExample.PNG
Received on Wednesday, 8 June 2016 02:12:02 UTC