Re: ISSUE-139: The primary keys use case

This is about having only one primary key per class. If the use case
description doesn't make it clear, it should be changed. A more complex
case would be about concatenating multiple properties to form a primary
key. This is left to extensions.

Irene Polikoff





On 6/9/16, 10:13 AM, "Peter F. Patel-Schneider" <pfpschneider@gmail.com>
wrote:

>This is not about primary keys.  A primary key has the special
>characteristic
>that there is only one primary key for each database table (which could be
>read as RDFS class in the RDF case).  There is no requirement here that
>there
>can be only one primary key property for a class.  The use case in the UCR
>document and accompanying documentation should be changed to fix this
>error.
>
>
>What will happen if this constraint component is used in an inverse
>property
>constraint?  Well, in non-extended RDF the inverse values of properties
>cannot
>be strings so the check for the inverse property value being a string is
>going
>to be false.  What will happen if this constraint component is used in a
>node
>constraint?  Well, the node is already an IRI, so it can't be a string
>literal.  So both of these situations end up with a constraint that is
>uniformly violated.  A good-style checker for SHACL might flag these as
>being
>questionable, but there is no problem in allowing these as valid SHACL as
>they
>are perfectly well behaved.
>
>But what makes this constraint component useful?  It is precisely that
>SHACL
>can validate that the instances of a class in an RDF graph have property
>values that determine their IRIs.  If this constraint component couldn't
>be
>used for this purpose then there would not be any reason to have it.
>
>Can this constraint component be used for other purposes?  Sure, it could
>be
>used as input to a DB-extraction tool to tell it how to construct the
>IRIs for
>nodes that it creates.  That certainly adds to the utility of this
>constraint
>component.  Does this extra potential use mean that the tool has to do a
>little bit of extra work if this constraint component were allowed in node
>and
>inverse property constraints?  Probably a little bit.  The tool needs to
>find
>the appropriate constraint components in a shapes graph, so it has to
>comprehend SHACL shapes graphs.  The only extra work it might need to do
>is to
>check that the constraint component occurs only in property constraints
>because it can't do anything useful if it isn't.
>
>
>As far as implementation goes, this constraint component is slightly
>unusual
>because it does two things.  It checks that there is a single property
>value
>and then checks that the focus node's string value is suitable.
>
>A boilerplate solution could look something like
>
>SELECT $this WHERE {
> { SELECT $this WHERE {
>     [boilerplate]
>   } HAVING ( COUNT ( DISTINCT ?value ) != 1 )
> } UNION {
>   [boilerplate]
>   BIND (CONCAT($uriStart, ENCODE_FOR_URI(str(?value))) AS ?uri) .
>   FILTER (str(?this) != ?uri)
> }
>}
>
>peter
>
>
>
>On 06/07/2016 07:11 PM, Holger Knublauch wrote:
>> On 7/06/2016 0:54, Peter F. Patel-Schneider wrote:
>>> Are you proposing that there should be a constraint component for
>>>primary
>>> keys?  I don't see any description of how this would work in property
>>> constraints so how can anyone determine how it would work in other
>>> constraints?  If you are not proposing that there be a constraint
>>>component
>>> for primary keys then I don't see any relevance to the discussion here.
>> 
>> I would like to elaborate this use case a bit because it has
>>implications on
>> ISSUE-139.
>> 
>> The "Primary Keys" feature has been mentioned in our Use Cases
>>deliverable as
>> UC25 [1] and in the wiki
>> 
>> https://www.w3.org/2014/data-shapes/wiki/Primary_Keys_with_URI_Pattern
>> 
>> This is a feature that has been in successful use based on SPIN in
>>TopBraid
>> products for a couple of years now. I have since ported it to SHACL as
>>part of
>> the DASH namespace. I have pasted a source code snippet to the bottom
>>of this
>> email.
>> 
>> An example in SHACL would be
>> 
>> ex:CountryShape
>>     a sh:Shape ;
>>     sh:scopeClass ex:Country ;
>>     sh:property [
>>         sh:predicate ex:countryCode ;
>>         dash:uriStart "http://example.org/Country-" ;
>>     ] .
>> 
>> A valid instance would be
>> 
>> ex:Country-de
>>   rdf:type ex:Country ;
>>   ex:countryCode "de" .
>> 
>> An invalid instance would be
>> 
>> ex:Country-incorrect
>>   rdf:type ex:Country ;
>>   ex:countryCode "en" .
>> 
>> The rule is that if a property has a dash:uriStart then that property
>>serves
>> as "primary key" which means that it must have exactly one value and
>>the URI
>> of the subject must be the uriStart + the value of the primary key, e.g.
>> "ex:Country-de" for a primary key value of "de" and a uriStart of
>>"ex:Country-".
>> 
>> This constraint component highlights an important strength of SHACL: It
>> provides machine-readable definitions of constraints that can be used
>>for
>> validation purposes but also many other use cases. In our particular
>>case, we
>> are using the primary key to produce well-formed URIs from a primary
>>key value
>> when a new instance is created. I have attached a screenshot of TopBraid
>> showing a dialog in which the user just enters the name of a Country
>>and its
>> country code, and the URI gets produced automatically.
>> 
>> The fact that this constraint can also be used for constraint checking
>>is a
>> great way of locking in a contract in the model, but the dash:uriStart
>>triples
>> can be queried by user interface tools too, and that use case is far
>>more
>> important than constraint validation here. For example, once we know
>>that a
>> primary key exists, looking up the URI for a given country is trivial
>>and
>> doesn't even require a query against the database.
>> 
>> Clearly, this constraint component makes no sense for either inverse
>> properties or node constraints. For inverse property constraints it
>>makes no
>> sense because the values could never be literals, making them
>>unsuitable to
>> build new URIs. For node constraints, it would be very hard to
>>even come
>> up with an explanation of what it could possibly mean. In a node
>>constraint,
>> we would have $this = $value, i.e. the subject that is supposed to have
>>a
>> certain URI is the same as the value of the primary key! Such a
>>constraint is
>> impossible to fulfill because the URI would have to include itself
>> recursively. This is just to show how silly such examples can become
>>with the
>> strict policy suggested here in this ticket.
>> 
>> By having a sh:context triple (see below), the creator of such a
>>constraint
>> component can clearly communicate how this component is supposed to be
>>used
>> and where it should not even be offered as a choice.
>> 
>> And the query of the validator (below) is quite irregular and would not
>>fit
>> into any of the proposed "boilerplate" generalizations.
>> 
>> This example demonstrates that the proposals to have only one validator
>>per
>> constraint component, and to always allow every constraint component,
>>make
>> SHACL fail to address real-world requirements.
>> 
>> Holger
>> 
>> [1] https://www.w3.org/TR/shacl-ucr/#uc25-primary-keys-with-uri-patterns
>> 
>> 
>> 
>> dash:PrimaryKeyConstraintComponent
>>   rdf:type sh:ConstraintComponent ;
>>   rdfs:comment "Enforces a constraint that the given property
>>(sh:predicate)
>> serves as primary key for all resources in the scope of the shape. If a
>> property has been declared to be the primary key then each resource
>>must have
>> exactly one value for that property. Furthermore, the URIs of those
>>resources
>> must start with a given string (dash:uriStart), followed by the
>>URL-encoded
>> primary key value. For example if dash:uriStart is
>> \"http://example.org/country-\" and the primary key for an instance is
>>\"de\"
>> then the URI must be \"http://example.org/country-de\". Finally, as a
>>result
>> of the URI policy, there cannot be any other resource with the same
>>value
>> under the same primary key policy." ;
>>   rdfs:label "Primary key constraint component" ;
>>   sh:context sh:PropertyConstraint ;
>>   sh:labelTemplate "The property {?predicate} is the primary key and
>>URIs
>> start with {?uriStart}" ;
>>   sh:parameter [
>>       sh:predicate dash:uriStart ;
>>       sh:datatype xsd:string ;
>>       sh:description "The start of the URIs of well-formed resources." ;
>>       sh:name "URI start" ;
>>     ] ;
>>   sh:propertyValidator [
>>       rdf:type sh:SPARQLSelectValidator ;
>>       sh:select """SELECT $this ($this AS ?subject) $predicate (?value
>>AS
>> ?object) ?message
>> WHERE {
>>     {
>>         FILTER NOT EXISTS {
>>             ?this $predicate ?any .
>>         } .
>>         BIND (\"Missing value for primary key property\" AS ?message) .
>>     }
>>     UNION
>>     {
>>         FILTER (dash:valueCount(?this, $predicate) > 1) .
>>         BIND (\"Multiple values of primary key property\" AS ?message) .
>>     }
>>     UNION
>>     {
>>         FILTER (dash:valueCount(?this, $predicate) = 1) .
>>         ?this $predicate ?value .
>>         BIND (CONCAT($uriStart, ENCODE_FOR_URI(str(?value))) AS ?uri) .
>>         FILTER (str(?this) != ?uri) .
>>         BIND (CONCAT(\"Primary key value \", str(?value), \" does not
>>align
>> with the expected URI \", ?uri) AS ?message) .
>>     } .
>> }""" ;
>>     ] ;
>> .
>> 
>
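
For readers following the thread, the rule described in the quoted messages (exactly one value for the key property, and a subject URI equal to dash:uriStart followed by the URL-encoded value) can be sketched outside SPARQL as below. This is only an illustrative sketch, not the DASH or TopBraid implementation: the function names mint_uri and check_primary_key are hypothetical, and urllib.parse.quote with safe="" is assumed to be a close enough stand-in for SPARQL's ENCODE_FOR_URI. The violation messages are copied from the quoted validator.

```python
from urllib.parse import quote


def mint_uri(uri_start: str, key_value: str) -> str:
    """Build the expected URI: uri_start followed by the percent-encoded key.

    Hypothetical helper. safe="" percent-encodes everything except
    unreserved characters, approximating SPARQL's ENCODE_FOR_URI.
    """
    return uri_start + quote(key_value, safe="")


def check_primary_key(focus_uri: str, values: list, uri_start: str):
    """Return a violation message, or None if the focus node conforms.

    `values` holds the focus node's values for the key property,
    given here as plain strings (an assumption of this sketch).
    """
    distinct = set(values)
    if not distinct:
        return "Missing value for primary key property"
    if len(distinct) > 1:
        return "Multiple values of primary key property"
    (value,) = distinct
    expected = mint_uri(uri_start, value)
    if focus_uri != expected:
        return (
            f"Primary key value {value} does not align "
            f"with the expected URI {expected}"
        )
    return None


if __name__ == "__main__":
    start = "http://example.org/Country-"
    # Valid instance from the quoted example.
    print(check_primary_key(start + "de", ["de"], start))
    # Invalid instance from the quoted example.
    print(check_primary_key(start + "incorrect", ["en"], start))
```

The same mint_uri helper covers the non-validation use mentioned in the thread, where a tool builds the URI for a new instance from its key value instead of checking an existing one.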

Received on Thursday, 9 June 2016 16:11:32 UTC