- From: Holger Knublauch <holger@topquadrant.com>
- Date: Wed, 8 Jun 2016 12:11:26 +1000
- To: "public-data-shapes-wg@w3.org" <public-data-shapes-wg@w3.org>
- Message-ID: <955dbc96-9daa-b514-9ab7-20294c53dcf9@topquadrant.com>
On 7/06/2016 0:54, Peter F. Patel-Schneider wrote:
> Are you proposing that there should be a constraint component for primary
> keys? I don't see any description of how this would work in property
> constraints so how can anyone determine how it would work in other
> constraints. If you are not proposing that there be a constraint component
> for primary keys then I don't see any relevance to the discussion here.
I would like to elaborate on this use case a bit because it has
implications for ISSUE-139.
The "Primary Keys" feature has been mentioned in our Use Cases
deliverable as UC25 [1] and in the wiki
https://www.w3.org/2014/data-shapes/wiki/Primary_Keys_with_URI_Pattern
This is a feature that has been in successful use based on SPIN in
TopBraid products for a couple of years now. I have since ported it to
SHACL as part of the DASH namespace. I have pasted a source code snippet
to the bottom of this email.
An example in SHACL would be
ex:CountryShape
    a sh:Shape ;
    sh:scopeClass ex:Country ;
    sh:property [
        sh:predicate ex:countryCode ;
        dash:uriStart "http://example.org/Country-" ;
    ] .
A valid instance would be
ex:Country-de
    rdf:type ex:Country ;
    ex:countryCode "de" .
An invalid instance would be
ex:Country-incorrect
    rdf:type ex:Country ;
    ex:countryCode "en" .
The rule is that if a property has a dash:uriStart then that property
serves as a "primary key", which means that it must have exactly one value
and the URI of the subject must be the uriStart + the value of the
primary key, e.g. "ex:Country-de" for a primary key value of "de" and a
uriStart of "ex:Country-".
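To make the rule concrete, here is a minimal Python sketch of the same check outside of SHACL. The function name check_primary_key and the way the values are passed in are assumptions for illustration only, and the sketch assumes that ex: expands to http://example.org/. Percent-encoding uses urllib.parse.quote with safe="" as a rough stand-in for SPARQL's ENCODE_FOR_URI.

```python
from urllib.parse import quote


def check_primary_key(subject_uri, values, uri_start):
    """Return a violation message, or None if the subject conforms.

    subject_uri: URI of the resource being checked, as a string
    values:      list of values the resource has for the primary key property
    uri_start:   the dash:uriStart string of the property constraint
    (All three parameter names are hypothetical.)
    """
    if len(values) == 0:
        return "Missing value for primary key property"
    if len(values) > 1:
        return "Multiple values of primary key property"
    # Exactly one value: the subject URI must be uriStart + encoded value.
    expected = uri_start + quote(str(values[0]), safe="")
    if subject_uri != expected:
        return ("Primary key value " + str(values[0]) +
                " does not align with the expected URI " + expected)
    return None
```

Under these assumptions, ex:Country-de with country code "de" yields no message, while ex:Country-incorrect with country code "en" is reported because the expected URI would end in "Country-en".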
This constraint component highlights an important strength of SHACL: It
provides machine-readable definitions of constraints that can be used
for validation purposes but also for many other use cases. In our particular
case, we are using the primary key to produce well-formed URIs from a
primary key value when a new instance is created. I have attached a
screenshot of TopBraid showing a dialog in which the user just enters
the name of a Country and its country code, and the URI gets produced
automatically.
The fact that this constraint can also be used for constraint checking
is a great way of locking in a contract in the model, but the
dash:uriStart triples can be queried by user interface tools too, and
that use case is far more important than constraint validation here. For
example, once we know that a primary key exists, looking up the URI for
a given country is trivial and doesn't even require a query against the
database.
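As a rough illustration of that non-validation use, a user interface tool that has read the dash:uriStart value could produce or look up the URI with plain string handling. The sketch below is Python with a hypothetical helper name mint_uri; it is not taken from TopBraid, and urllib.parse.quote with safe="" is only an approximation of ENCODE_FOR_URI.

```python
from urllib.parse import quote


def mint_uri(uri_start, key_value):
    """Build the URI for a resource from its primary key value.

    uri_start is the dash:uriStart string; key_value is the value the
    user entered for the primary key property. No database query is
    needed, only string concatenation and percent-encoding.
    """
    return uri_start + quote(str(key_value), safe="")


# Example with the country shape from this email
# (assuming ex: expands to http://example.org/).
country_uri = mint_uri("http://example.org/Country-", "de")
```

The same function serves both cases mentioned above: creating the URI for a new instance and finding the URI of an existing one from its key.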
Clearly, this constraint component makes no sense for either inverse
properties or node constraints. For inverse property constraints it
makes no sense because the values could never be literals, making them
unsuitable to build new URIs. For node constraints, it would be
very hard to even come up with an explanation of what it could possibly
mean. In a node constraint, we would have $this = $value, i.e. the
subject that is supposed to have a certain URI is the same as the value
of the primary key! Such a constraint is impossible to fulfill because
the URI would have to include itself recursively. This is just to show
how silly such examples can become with the strict policy suggested here
in this ticket.
By having a sh:context triple (see below), the creator of such a
constraint component can clearly communicate how this component is
supposed to be used and where it should not even be offered as a choice.
And the query of the validator (below) is quite irregular and would not
fit into any of the proposed "boilerplate" generalizations.
This example demonstrates that the proposals to have only one validator
per constraint component, and to always allow every constraint
component, make SHACL fail to address real-world requirements.
Holger
[1] https://www.w3.org/TR/shacl-ucr/#uc25-primary-keys-with-uri-patterns
dash:PrimaryKeyConstraintComponent
    rdf:type sh:ConstraintComponent ;
    rdfs:comment "Enforces a constraint that the given property (sh:predicate) serves as primary key for all resources in the scope of the shape. If a property has been declared to be the primary key then each resource must have exactly one value for that property. Furthermore, the URIs of those resources must start with a given string (dash:uriStart), followed by the URL-encoded primary key value. For example if dash:uriStart is \"http://example.org/country-\" and the primary key for an instance is \"de\" then the URI must be \"http://example.org/country-de\". Finally, as a result of the URI policy, there can not be any other resource with the same value under the same primary key policy." ;
    rdfs:label "Primary key constraint component" ;
    sh:context sh:PropertyConstraint ;
    sh:labelTemplate "The property {?predicate} is the primary key and URIs start with {?uriStart}" ;
    sh:parameter [
        sh:predicate dash:uriStart ;
        sh:datatype xsd:string ;
        sh:description "The start of the URIs of well-formed resources." ;
        sh:name "URI start" ;
    ] ;
    sh:propertyValidator [
        rdf:type sh:SPARQLSelectValidator ;
        sh:select """SELECT $this ($this AS ?subject) $predicate (?value AS ?object) ?message
WHERE {
    {
        FILTER NOT EXISTS {
            ?this $predicate ?any .
        } .
        BIND (\"Missing value for primary key property\" AS ?message) .
    }
    UNION
    {
        FILTER (dash:valueCount(?this, $predicate) > 1) .
        BIND (\"Multiple values of primary key property\" AS ?message) .
    }
    UNION
    {
        FILTER (dash:valueCount(?this, $predicate) = 1) .
        ?this $predicate ?value .
        BIND (CONCAT($uriStart, ENCODE_FOR_URI(str(?value))) AS ?uri) .
        FILTER (str(?this) != ?uri) .
        BIND (CONCAT(\"Primary key value \", str(?value), \" does not align with the expected URI \", ?uri) AS ?message) .
    } .
}""" ;
    ] ;
.
Attachments
- image/png attachment: PrimaryKeyExample.PNG
Received on Wednesday, 8 June 2016 02:12:02 UTC