- From: Dave Reynolds <dave.e.reynolds@gmail.com>
- Date: Thu, 28 Feb 2013 15:04:02 +0000
- To: Government Linked Data Working Group <public-gld-wg@w3.org>
ISSUE-29 [1] is the one that entails the most work but which we don't want to drop.

The issue is about defining constraints on what a "well formed" Data Cube should look like when published, which consuming software can then rely on. This is about more than the usual semantic constraints on the vocabulary, because in this domain we also want some notions of closed-world completeness. If expected assertions are missing, that's not an inconsistency from an OWL point of view, but it is an interoperability problem for Data Cube.

I think we need to agree on:

1. roughly what constraints we wish to impose
2. the approach for expressing constraints
3. then the precise details of the set of constraints

This note is to sketch out a rough approach before getting too far down into the details.

# Outline approach

1. Define a set of expansion rules which can derive a full Data Cube from abbreviated format.

2. Define a "well formed abbreviated Data Cube" as one which, when expanded using the expansion rules, is a "well formed Data Cube".

3. The expansion rules will be expressed as SPARQL 1.1 CONSTRUCT expressions, similar to what we did on ORG. We define an order (possibly iterative, but that may not be needed) in which the CONSTRUCT expressions are applied. At each stage the result is the union of the source Data Cube graph and the graph generated by the CONSTRUCT.

4. The primary expansion rules will be:
   - expansion of components which have been abbreviated through use of qb:componentAttachment
   - propagation of dimensions given on a qb:Slice to each qb:observation within that slice
   - deduction of some implicit rdf:type assertions from domain/range constraints
   - (possibly) deduction of a default value for qb:componentRequired for any component which is not explicitly marked as optional (some details around measures on multi-measure cubes to sort out here)

5. A "well formed Data Cube" is an RDF graph which uses elements from the RDF Data Cube vocabulary and for which every SPARQL ASK query in a set of validation rules (see later) returns false. In addition, the RDF graph should be consistent under RDF D-entailment using the XSD datatype map (as defined in RDF Semantics).

6. Within the context of a particular data interchange, additional constraints may be imposed beyond Data Cube well-formedness. In particular, such an interchange may require that the Data Cube graph, together with some import closure of ontologies, also be consistent under OWL with RDF or DL semantics. The Data Cube specification itself does not require this, nor does it specify any mechanism to declare the relevant semantics or ontologies to import beyond those available in existing W3C standards.

# Sketch of validation checks

a. Every qb:Observation has precisely one qb:dataSet property (no orphaned observations).

b. Every qb:DataSet has precisely one qb:structure property (all data sets have a data structure definition).

c. For every qb:Observation o :-
     For every qb:component (cp) within the qb:DataStructureDefinition of the qb:DataSet of o which is marked as qb:componentRequired true :-
       o has a value for cp
   [This will need some modifications in the case of multi-measure cubes which use MeasureDimensions; I am still working through the details.]

d. For each qb:Slice which has a qb:sliceStructure value (sk) :-
     for each qb:componentProperty of sk (cp) :-
       the qb:Slice should have a value for cp

e. If the Data Cube is a measure dimension multi-measure cube then every qb:Observation has a value for qb:measureType and a value for only one measure.

g. Every qb:DimensionProperty must have a declared rdfs:range, and if that range is skos:Concept it must have an associated qb:codeList. The range may be an xsd data type.

Does that seem like a reasonable approach? Any immediately obvious problems or holes with the outline checks?

Dave

[1] http://www.w3.org/2011/gld/track/issues/29
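
For illustration only, validation checks (a) and (b) from the sketch in this message could be prototyped roughly as follows. This is a minimal sketch in plain Python over an in-memory set of triples, not the SPARQL ASK form the proposal intends to use. The helper names (`instances_of`, `violations`, `is_well_formed`), the prefixed-name strings and the `ex:` sample data are all assumptions made up for the example.

```python
# Hypothetical stand-in for an RDF graph: a set of (subject, predicate, object)
# tuples using prefixed names as plain strings.
RDF_TYPE = "rdf:type"


def instances_of(triples, cls):
    """Return every subject explicitly typed as cls."""
    return {s for (s, p, o) in triples if p == RDF_TYPE and o == cls}


def violations(triples, cls, predicate):
    """Return instances of cls that do not have exactly one value for predicate."""
    bad = []
    for subject in instances_of(triples, cls):
        values = {o for (s, p, o) in triples if s == subject and p == predicate}
        if len(values) != 1:
            bad.append(subject)
    return sorted(bad)


def is_well_formed(triples):
    """Apply checks (a) and (b) only; an empty violation list means the check passes."""
    check_a = violations(triples, "qb:Observation", "qb:dataSet")
    check_b = violations(triples, "qb:DataSet", "qb:structure")
    return not check_a and not check_b


# Made-up sample data: ex:o2 is an orphaned observation (no qb:dataSet).
triples = {
    ("ex:ds1", RDF_TYPE, "qb:DataSet"),
    ("ex:ds1", "qb:structure", "ex:dsd1"),
    ("ex:o1", RDF_TYPE, "qb:Observation"),
    ("ex:o1", "qb:dataSet", "ex:ds1"),
    ("ex:o2", RDF_TYPE, "qb:Observation"),
}

print(violations(triples, "qb:Observation", "qb:dataSet"))
print(is_well_formed(triples))
```

Note that this sketch only sees explicit rdf:type assertions, which is one reason the outline applies the expansion rules (including type deduction from domain/range) before running the checks.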
Received on Thursday, 28 February 2013 15:04:32 UTC