- From: Peter F. Patel-Schneider <pfpschneider@gmail.com>
- Date: Mon, 28 Dec 2015 09:05:20 -0800
- To: RDF Data Shapes Working Group <public-data-shapes-wg@w3.org>
I took a look at "Validating and Describing Linked Data Portals using Shapes", as submitted to the Semantic Web Journal in early December. The submitted paper is currently available at www.semantic-web-journal.net/system/files/swj1260.pdf, but this version differs in unknown ways from the version that I looked at.

The submission makes extensive use of an example about measuring the World Wide Web's contribution to global development and human rights. This example comes from a previous paper by J. E. L. Gayo, H. Farham, J. C. Fernández, and J. M. Á. Rodríguez, "Representing statistical indexes as linked data including metadata about their computation process". The ShEx provided in the submission for the example has some significant unexplained differences from the example in the published paper. I was unable to determine the exact details of the example, as there is no definition of the formalism used for the bulk of the information about the example (Figure 2 in the submission).

Here is my reconstruction of the data model in Figure 2, plus the suborganization relationship and a little bit more from the earlier paper. I am using a ShEx-like syntax to capture something like the form of the example, but this isn't necessarily ShEx, just a syntax to show the data model for the example.

dataset {
  rdf:type ( qb:DataSet ) [1,1],
  qb:structure wf:DSD [1,1],
  rdfs:label xsd:string [1,1],
  dct:publisher @organization [1,1],
  qb:slice @slice [1,*]
}

slice {
  rdf:type ( qb:Slice ) [1,1],
  qb:sliceStructure wf:sliceByArea [1,1],
  qb:observation @observation [1,*],
  cex:indicator @indicator [1,1]
}

organization {
  rdf:type ( org:Organization ) [1,1],
  rdfs:label xsd:string [1,1],
  foaf:homepage URI [1,1],
  org:hasSubOrganization @organization [0,*]
}

observation {
  rdf:type ( qb:Observation ) [1,1],
  cex:value xsd:float [1,1],
  dcterms:issued xsd:dateTime [1,1],
  rdfs:label xsd:string [1,1],
  cex:ref-year xsd:gYear [1,1],
  cex:ref-area @country [1,1],
  cex:indicator @indicator [1,1],
  cex:computation @computation
}

indicator {
  rdf:type ( cex:Primary cex:Secondary ) [1,1],
  rdfs:label xsd:string [1,1],
  rdfs:comment xsd:string [1,1],
  skos:notation xsd:string [1,1],
  wf:provider @organization [1,1]
}

country {
  rdf:type ( wf:Country ) [1,1],
  wf:iso2 xsd:string [1,1],
  wf:iso3 xsd:string [1,1],
  rdfs:label xsd:string [1,1]
}

computation {
  rdf:type ( cex:Computation ) [1,1]
}

The actual task to be performed is not described in the submission. It appears to me that the natural task is to determine whether an RDF graph containing information about observations conforms to this data model, for some definition of "conforms". This determination could be done in a number of ways in SHACL. The approach taken in the submission is to use a set of mutually recursive SHACL shapes. However, it seems to me that it would be better to use non-recursive SHACL shapes with scopes instead, as follows:

dataset sh:scopeClass qb:DataSet ;
  sh:property [ sh:predicate qb:structure; sh:class wf:DSD ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate rdfs:label; sh:datatype xsd:string ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate dct:publisher; sh:class org:Organization ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate qb:slice; sh:class qb:Slice ; sh:minCount 1 ] .

slice sh:scopeClass qb:Slice ;
  sh:property [ sh:predicate qb:sliceStructure; sh:class wf:sliceByArea ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate qb:observation; sh:class qb:Observation ; sh:minCount 1 ] ;
  sh:constraint [
    a sh:OrConstraint ;
    sh:shapes (
      [ sh:property [ sh:predicate cex:indicator; sh:class cex:Primary ; sh:minCount 1 ; sh:maxCount 1 ] ]
      [ sh:property [ sh:predicate cex:indicator; sh:class cex:Secondary ; sh:minCount 1 ; sh:maxCount 1 ] ]
    )
  ] .

organization sh:scopeClass org:Organization ;
  sh:property [ sh:predicate rdfs:label; sh:datatype xsd:string ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate foaf:homepage; sh:nodeKind sh:IRI ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate org:hasSubOrganization; sh:class org:Organization ] .

observation sh:scopeClass qb:Observation ;
  sh:property [ sh:predicate cex:value; sh:datatype xsd:float ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate dcterms:issued; sh:datatype xsd:dateTime ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate rdfs:label; sh:datatype xsd:string ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate cex:ref-year; sh:datatype xsd:gYear ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate cex:ref-area; sh:class wf:Country ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:constraint [
    a sh:OrConstraint ;
    sh:shapes (
      [ sh:property [ sh:predicate cex:indicator; sh:class cex:Primary ; sh:minCount 1 ; sh:maxCount 1 ] ]
      [ sh:property [ sh:predicate cex:indicator; sh:class cex:Secondary ; sh:minCount 1 ; sh:maxCount 1 ] ]
    )
  ] ;
  sh:property [ sh:predicate cex:computation; sh:class cex:Computation ; sh:minCount 1 ; sh:maxCount 1 ] .

indicator sh:scopeClass cex:Primary ;
  sh:scopeClass cex:Secondary ;
  sh:property [ sh:predicate rdfs:label; sh:datatype xsd:string ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate rdfs:comment; sh:datatype xsd:string ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate skos:notation; sh:datatype xsd:string ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate wf:provider; sh:class org:Organization ; sh:minCount 1 ; sh:maxCount 1 ] .

country sh:scopeClass wf:Country ;
  sh:property [ sh:predicate wf:iso2; sh:datatype xsd:string ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate wf:iso3; sh:datatype xsd:string ; sh:minCount 1 ; sh:maxCount 1 ] ;
  sh:property [ sh:predicate rdfs:label; sh:datatype xsd:string ; sh:minCount 1 ; sh:maxCount 1 ] .

The significant difference between the treatment here and the treatment in the submission is that type information is used as scopes, so that the shape of a portion of the data is not mandated by its position as a value for some other portion of the data but is instead mandated by its type. This results in a difference of behaviour, but I think that this SHACL encoding better matches the use here than the ShEx encoding does.

The point here is mostly to show that a major example of recursive shapes does not appear to need recursive shapes, nor even shapes referring to other shapes at all.

peter
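
As a rough illustration of the scope-by-type behaviour described above, here is a minimal Python sketch. It is not SHACL and not any real validator: the triple set, the `ex:` node names, the `SHAPES` table, and the `validate` function are all made up for this illustration, and it only models cardinality and a simplified `sh:class` check (no subclass reasoning, no datatypes).

```python
RDF_TYPE = "rdf:type"

# Hypothetical toy data graph as (subject, predicate, object) triples.
GRAPH = {
    ("ex:spain", RDF_TYPE, "wf:Country"),
    ("ex:spain", "wf:iso2", "ES"),
    ("ex:spain", "wf:iso3", "ESP"),
    ("ex:spain", "rdfs:label", "Spain"),
    ("ex:obs1", RDF_TYPE, "qb:Observation"),
    ("ex:obs1", "cex:ref-area", "ex:spain"),
    ("ex:obs2", RDF_TYPE, "qb:Observation"),
    ("ex:obs2", "cex:ref-area", "ex:untyped"),
    ("ex:untyped", "wf:iso2", "XX"),  # no rdf:type, so no scope selects it
}

# Shapes keyed by scope class.
# Each predicate maps to (required class of values or None, min count, max count).
SHAPES = {
    "wf:Country": {
        "wf:iso2": (None, 1, 1),
        "wf:iso3": (None, 1, 1),
        "rdfs:label": (None, 1, 1),
    },
    "qb:Observation": {
        "cex:ref-area": ("wf:Country", 1, 1),
    },
}


def values(graph, subject, predicate):
    """Objects of all triples with the given subject and predicate."""
    return {o for (s, p, o) in graph if s == subject and p == predicate}


def instances(graph, cls):
    """Nodes with an explicit rdf:type triple for cls (no subclass reasoning)."""
    return {s for (s, p, o) in graph if p == RDF_TYPE and o == cls}


def validate(graph, shapes):
    """Check every shape against the nodes selected by its scope class."""
    violations = []
    for scope_class, props in shapes.items():
        for node in sorted(instances(graph, scope_class)):
            for pred, (value_class, lo, hi) in props.items():
                vals = values(graph, node, pred)
                if not lo <= len(vals) <= hi:
                    violations.append((node, pred, "count"))
                if value_class is not None:
                    typed = instances(graph, value_class)
                    for v in sorted(vals):
                        if v not in typed:
                            violations.append((node, pred, "class"))
    return violations


if __name__ == "__main__":
    for violation in validate(GRAPH, SHAPES):
        print(violation)
```

In this sketch `ex:untyped` is never checked against the country shape, because nothing declares it to be a `wf:Country`; the only violation reported is on `ex:obs2`, whose `cex:ref-area` value is not an instance of `wf:Country`. A position-driven encoding would instead check `ex:untyped` against the country shape simply because it appears as a `cex:ref-area` value.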
Received on Monday, 28 December 2015 17:05:49 UTC