- From: Irene Polikoff <irene@topquadrant.com>
- Date: Thu, 09 Jun 2016 12:10:56 -0400
- To: "Peter F. Patel-Schneider" <pfpschneider@gmail.com>, Holger Knublauch <holger@topquadrant.com>, "public-data-shapes-wg@w3.org" <public-data-shapes-wg@w3.org>
This is about having only one primary key per class. If the use case
description doesn't make it clear, it should be changed. A more complex case
would be about concatenating multiple properties to form a primary key. This
is left to extensions.

Irene Polikoff

On 6/9/16, 10:13 AM, "Peter F. Patel-Schneider" <pfpschneider@gmail.com> wrote:

> This is not about primary keys. A primary key has the special characteristic
> that there is only one primary key for each database table (which could be
> read as RDFS class in the RDF case). There is no requirement here that there
> can be only one primary key property for a class. The use case in the UCR
> document and accompanying documentation should be changed to fix this error.
>
> What will happen if this constraint component is used in an inverse property
> constraint? Well, in non-extended RDF the inverse values of properties cannot
> be strings so the check for the inverse property value being a string is going
> to be false. What will happen if this constraint component is used in a node
> constraint? Well, the node is already an IRI, so it can't be a string
> literal. So both of these situations end up with a constraint that is
> uniformly violated. A good-style checker for SHACL might flag these as being
> questionable, but there is no problem in allowing these as valid SHACL as they
> are perfectly well behaved.
>
> But what makes this constraint component useful? It is precisely that SHACL
> can validate that the instances of a class in an RDF graph have property
> values that determine their IRIs. If this constraint component couldn't be
> used for this purpose then there would not be any reason to have it.
>
> Can this constraint component be used for other purposes? Sure, it could be
> used as input to a DB-extraction tool to tell it how to construct the IRIs for
> nodes that it creates. That certainly adds to the utility of this constraint
> component.
> Does this extra potential use mean that the tool has to do a little bit of
> extra work if this constraint component was allowed in node and inverse
> property constraints? Probably a little bit. The tool needs to find the
> appropriate constraint components in a shapes graph, so it has to comprehend
> SHACL shapes graphs. The only extra that it might need to do is to check
> that the constraint component occurs only in property constraints because
> it can't do anything useful if it isn't.
>
> As far as implementation goes, this constraint component is slightly unusual
> because it does two things. It checks that there is a single property value
> and then checks that the focus node's string value is suitable.
>
> A boilerplate solution could look something like
>
> SELECT $this WHERE {
>   { SELECT $this WHERE {
>       [boilerplate]
>     } HAVING ( COUNT ( DISTINCT ?value ) != 1 )
>   } UNION {
>     [boilerplate]
>     BIND (CONCAT($uriStart, ENCODE_FOR_URI(str(?value))) AS ?uri) .
>     FILTER (str(?this) != ?uri)
>   }
> }
>
> peter
>
>
> On 06/07/2016 07:11 PM, Holger Knublauch wrote:
>> On 7/06/2016 0:54, Peter F. Patel-Schneider wrote:
>>> Are you proposing that there should be a constraint component for primary
>>> keys? I don't see any description of how this would work in property
>>> constraints so how can anyone determine how it would work in other
>>> constraints. If you are not proposing that there be a constraint component
>>> for primary keys then I don't see any relevance to the discussion here.
>>
>> I would like to elaborate this use case a bit because it has implications on
>> ISSUE-139.
>>
>> The "Primary Keys" feature has been mentioned in our Use Cases deliverable as
>> UC25 [1] and in the wiki
>>
>> https://www.w3.org/2014/data-shapes/wiki/Primary_Keys_with_URI_Pattern
>>
>> This is a feature that has been in successful use based on SPIN in TopBraid
>> products for a couple of years now. I have since ported it to SHACL as part of
>> the DASH namespace.
>> I have pasted a source code snippet to the bottom of this email.
>>
>> An example in SHACL would be
>>
>> ex:CountryShape
>>     a sh:Shape ;
>>     sh:scopeClass ex:Country ;
>>     sh:property [
>>         sh:predicate ex:countryCode ;
>>         dash:uriStart "http://example.org/Country-" ;
>>     ] .
>>
>> A valid instance would be
>>
>> ex:Country-de
>>     rdf:type ex:Country ;
>>     ex:countryCode "de" .
>>
>> An invalid instance would be
>>
>> ex:Country-incorrect
>>     rdf:type ex:Country ;
>>     ex:countryCode "en" .
>>
>> The rule is that if a property has a dash:uriStart then that property serves
>> as "primary key" which means that it must have exactly one value and the URI
>> of the subject must be the uriStart + the value of the primary key, e.g.
>> "ex:Country-de" for a primary key value of "de" and a uriStart of
>> "ex:Country-".
>>
>> This constraint component highlights an important strength of SHACL: It
>> provides machine-readable definitions of constraints that can be used for
>> validation purposes but also many other use cases. In our particular case, we
>> are using the primary key to produce well-formed URIs from a primary key value
>> when a new instance is created. I have attached a screenshot of TopBraid
>> showing a dialog in which the user just enters the name of a Country and its
>> country code, and the URI gets produced automatically.
>>
>> The fact that this constraint can also be used for constraint checking is a
>> great way of locking in a contract in the model, but the dash:uriStart triples
>> can be queried by user interface tools too, and that use case is far more
>> important than constraint validation here. For example, once we know that a
>> primary key exists, looking up the URI for a given country is trivial and
>> doesn't even require a query against the database.
>>
>> Clearly, this constraint component makes no sense for either inverse
>> properties or node constraints.
>> For inverse property constraints it makes no sense because the values could
>> never be literals, making them unsuitable to build new URIs. For node
>> constraints, it would be very hard to even come up with an explanation of
>> what it could possibly mean. In a node constraint, we would have
>> $this = $value, i.e. the subject that is supposed to have a certain URI is
>> the same as the value of the primary key! Such a constraint is impossible to
>> fulfill because the URI would have to include itself recursively. This is
>> just to show how silly such examples can become with the strict policy
>> suggested here in this ticket.
>>
>> By having a sh:context triple (see below), the creator of such a constraint
>> component can clearly communicate how this component is supposed to be used
>> and where it should not even be offered as a choice.
>>
>> And the query of the validator (below) is quite irregular and would not fit
>> into any of the proposed "boilerplate" generalizations.
>>
>> This example demonstrates that the proposals to have only one validator per
>> constraint component, and to always allow every constraint component, make
>> SHACL fail to address real-world requirements.
>>
>> Holger
>>
>> [1] https://www.w3.org/TR/shacl-ucr/#uc25-primary-keys-with-uri-patterns
>>
>>
>> dash:PrimaryKeyConstraintComponent
>>   rdf:type sh:ConstraintComponent ;
>>   rdfs:comment "Enforces a constraint that the given property (sh:predicate)
>> serves as primary key for all resources in the scope of the shape. If a
>> property has been declared to be the primary key then each resource must have
>> exactly one value for that property. Furthermore, the URIs of those resources
>> must start with a given string (dash:uriStart), followed by the URL-encoded
>> primary key value.
>> For example if dash:uriStart is
>> \"http://example.org/country-\" and the primary key for an instance is \"de\"
>> then the URI must be \"http://example.org/country-de\". Finally, as a result
>> of the URI policy, there can not be any other resource with the same value
>> under the same primary key policy." ;
>>   rdfs:label "Primary key constraint component" ;
>>   sh:context sh:PropertyConstraint ;
>>   sh:labelTemplate "The property {?predicate} is the primary key and URIs start with {?uriStart}" ;
>>   sh:parameter [
>>     sh:predicate dash:uriStart ;
>>     sh:datatype xsd:string ;
>>     sh:description "The start of the URIs of well-formed resources." ;
>>     sh:name "URI start" ;
>>   ] ;
>>   sh:propertyValidator [
>>     rdf:type sh:SPARQLSelectValidator ;
>>     sh:select """SELECT $this ($this AS ?subject) $predicate (?value AS ?object) ?message
>> WHERE {
>>     {
>>         FILTER NOT EXISTS {
>>             ?this $predicate ?any .
>>         } .
>>         BIND (\"Missing value for primary key property\" AS ?message) .
>>     }
>>     UNION
>>     {
>>         FILTER (dash:valueCount(?this, $predicate) > 1) .
>>         BIND (\"Multiple values of primary key property\" AS ?message) .
>>     }
>>     UNION
>>     {
>>         FILTER (dash:valueCount(?this, $predicate) = 1) .
>>         ?this $predicate ?value .
>>         BIND (CONCAT($uriStart, ENCODE_FOR_URI(str(?value))) AS ?uri) .
>>         FILTER (str(?this) != ?uri) .
>>         BIND (CONCAT(\"Primary key value \", str(?value), \" does not align with the expected URI \", ?uri) AS ?message) .
>>     } .
>> }""" ;
>>   ] ;
>> .
>>
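
For readers who want to try the rule outside a SPARQL engine, the three branches of the validator quoted above (missing value, multiple values, URI mismatch) can be sketched in plain Python. This is only an illustration under stated assumptions, not the DASH or TopBraid implementation: the function name check_primary_key and the dict-of-lists input are hypothetical, and urllib.parse.quote with safe="" is used as an approximation of SPARQL's ENCODE_FOR_URI.

```python
from urllib.parse import quote


def check_primary_key(uri_start, values_by_subject):
    """Return (subject, message) pairs for each primary-key violation.

    values_by_subject is a hypothetical stand-in for the data graph: it maps
    a subject URI to the list of values of the primary-key property.
    """
    violations = []
    for subject, values in values_by_subject.items():
        distinct = set(values)
        if not distinct:
            # Mirrors the FILTER NOT EXISTS branch of the quoted query.
            violations.append((subject, "Missing value for primary key property"))
        elif len(distinct) > 1:
            # Mirrors the valueCount > 1 branch.
            violations.append((subject, "Multiple values of primary key property"))
        else:
            (value,) = distinct
            # Assumption: quote(..., safe="") approximates ENCODE_FOR_URI.
            expected = uri_start + quote(str(value), safe="")
            if subject != expected:
                violations.append((
                    subject,
                    f"Primary key value {value} does not align "
                    f"with the expected URI {expected}",
                ))
    return violations


if __name__ == "__main__":
    data = {
        "http://example.org/Country-de": ["de"],
        "http://example.org/Country-incorrect": ["en"],
    }
    for subject, message in check_primary_key("http://example.org/Country-", data):
        print(subject, "->", message)
```

With the two country instances from the quoted example, only ex:Country-incorrect should be reported, because its expected URI would end in "Country-en".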
Received on Thursday, 9 June 2016 16:11:32 UTC