W3C RDF Data Cube dimensions definition

Date: Thu, 2 Feb 2017 17:01:34 +0000
The formal normative document at https://www.w3.org/TR/vocab-data-cube/ does not seem to actually define 'dimensions' or 'multi-dimensional', other than as distinguishing aspects of a 'hypercube' of 'observations'.

Neither does the underlying SDMX ISO statistical document at https://sdmx.org/?page_id=5008 :
 "These metadata values and concepts can be understood as the named dimensions of a multi-dimensional co-ordinate system, describing what is often called a "cube" of data."

Section 5.2 is the most informative, but not normative:

"A statistical data set comprises a collection of observations made at some points across some logical space. The collection can be characterized by a set of dimensions that define what the observation applies to (e.g. time, area, gender) along with metadata describing what has been measured (e.g. economic activity, population), how it was measured and how the observations are expressed (e.g. units, multipliers, status). We can think of the statistical data set as a multi-dimensional space, or hyper-cube, indexed by those dimensions. This space is commonly referred to as a cube for short; though the name shouldn't be taken literally, it is not meant to imply that there are exactly three dimensions (there can be more or fewer) nor that all the dimensions are somehow similar in size.

A cube is organized according to a set of dimensions, attributes and measures. ...

... The dimension components serve to identify the observations. A set of values for all the dimension components is sufficient to identify a single observation. Examples of dimensions include the time to which the observation applies, or a geographic region which the observation covers......"
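
To make the "sufficient to identify a single observation" point concrete, here is a minimal Python sketch. It is only my illustration, not anything from the spec: the dimension names, the area codes and the numbers are all made up, and a real cube would of course be RDF rather than a dict.

```python
from typing import Dict, Tuple

# Hypothetical dimension names, for illustration only.
DIMENSIONS = ("refPeriod", "refArea", "sex")

# Each observation is keyed by exactly one value per dimension.
# The stored number stands in for the measure; the values are invented.
observations: Dict[Tuple[str, ...], float] = {
    ("2004", "A", "male"): 1.0,
    ("2004", "A", "female"): 2.0,
    ("2005", "B", "male"): 3.0,
}


def lookup(cube: Dict[Tuple[str, ...], float], **coords: str) -> float:
    """Return the single observation identified by a value for every dimension."""
    # Build the key in the fixed dimension order; a missing dimension raises KeyError.
    key = tuple(coords[d] for d in DIMENSIONS)
    return cube[key]


value = lookup(observations, refPeriod="2004", refArea="A", sex="male")
```

Because a dict key is unique, two observations can never share the same full set of dimension values, which is the identifying role the quoted text gives to dimensions.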

Section 5.3 gives us the freedom to specialise:

".... In statistical applications it is common to work with slices in which a single dimension is left unspecified. In particular, to refer to such slices in which the single free dimension is time as Time Series and to refer slices along non-time dimensions as Sections. Within the Data Cube vocabulary we allow arbitrary dimensionality slices and do not give different names to particular types of slice. Such sub-classes of slice could be added in extension vocabularies."

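As a rough sketch of what an "arbitrary dimensionality slice" amounts to, the Python below fixes some dimensions and leaves the rest free. The dimension names, values and the slice_cube helper are hypothetical, chosen only to illustrate the idea; the spec itself models slices in RDF.

```python
from typing import Dict, Tuple

# Hypothetical dimension names, for illustration only.
DIMENSIONS = ("refPeriod", "refArea", "sex")

# Invented observations keyed by one value per dimension.
cube: Dict[Tuple[str, ...], float] = {
    ("2004", "A", "male"): 1.0,
    ("2005", "A", "male"): 2.0,
    ("2004", "B", "male"): 3.0,
    ("2005", "B", "female"): 4.0,
}


def slice_cube(cube: Dict[Tuple[str, ...], float], **fixed: str) -> Dict[Tuple[str, ...], float]:
    """Keep observations matching the fixed values, keyed by the free dimensions only."""
    # Positions of the dimensions that were left unspecified.
    free = [i for i, d in enumerate(DIMENSIONS) if d not in fixed]
    result = {}
    for key, value in cube.items():
        if all(key[DIMENSIONS.index(d)] == v for d, v in fixed.items()):
            result[tuple(key[i] for i in free)] = value
    return result


# One free dimension, and it is time: a "Time Series" in the quoted terminology.
time_series = slice_cube(cube, refArea="A", sex="male")

# Two free dimensions: still a slice, just of higher dimensionality.
by_area_and_sex = slice_cube(cube, refPeriod="2004")
```

Nothing in the helper cares which dimension is left free, which matches the vocabulary's choice not to name particular kinds of slice.
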
Section 6 allows us to optionally order dimensions.

Section 6.1 states that the Data Cube vocabulary represents the dimensions, attributes and measures as RDF properties.

The integrity checks IC-4 and IC-5 state that every dimension must have a declared range, and that every dimension with a range of 'concept' (skos:Concept) must have a code list.
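
The spec expresses its integrity checks as SPARQL ASK queries; the Python below is only my plain-data approximation of the two rules, to show how little they actually constrain. The dimension URIs, the dict layout and the check_dimensions helper are assumptions for illustration.

```python
from typing import Dict, List, Tuple

SKOS_CONCEPT = "skos:Concept"

# Hypothetical dimension declarations, as if read from a data structure definition.
dimensions: Dict[str, Dict[str, str]] = {
    "ex:refArea": {"range": "skos:Concept", "codeList": "ex:areas"},
    "ex:refPeriod": {"range": "xsd:gYear"},
    "ex:sex": {"range": "skos:Concept"},  # no code list: breaks the IC-5 rule
    "ex:depth": {},                       # no range: breaks the IC-4 rule
}


def check_dimensions(dims: Dict[str, Dict[str, str]]) -> List[Tuple[str, str]]:
    """Return (dimension, rule) pairs for each declaration that breaks a rule."""
    violations = []
    for name, decl in sorted(dims.items()):
        rng = decl.get("range")
        if rng is None:
            # IC-4: every dimension must have a declared range.
            violations.append((name, "IC-4"))
        elif rng == SKOS_CONCEPT and "codeList" not in decl:
            # IC-5: a dimension ranging over concepts must have a code list.
            violations.append((name, "IC-5"))
    return violations


violations = check_dimensions(dimensions)
```

Note that neither rule says what a dimension is; they only require that a range is declared, which is part of why I am asking the question at the end of this message.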

Interestingly, 'dimension' is sub-classed into:
1. Dimension
2. measureDimension
3. timeDimension

A measureDimension indicates the variables or types of data of interest, so our CRSs are associated with Dimension, with the proviso that this includes proper temporal CRSs. I posit that calendars are NOT CRSs but complex entities in their own right, and that they fit into the timeDimension, as this has examples of dateTime, etc.

I.e. the Dimension has a specialisation that carries the ideas of a single axis, one origin, positive and negative directions, and one unit of measure.

The timeDimension's calendars, by contrast, have complicated cycles of durations, counts and units.

Sending this now, though I have not yet finished thinking about it or delving into it.

Do we need to have a normative definition of 'dimension'?


