Re: W3C RDF Data Cube dimensions definition

Hi Chris,

I think it’s quite difficult to define “dimension” properly and concisely without getting circular. I tried once for some other documentation but couldn’t find a good solution. It reminds me of a sentence I read in an old computer graphics book that tried to define “line” by saying “A line is a line between two points”…

But I think it’s useful to acknowledge that “dimension” can have different connotations – i.e. that it’s not always a physical dimension, and that you can have a 1-dimensional geometry within a 3-dimensional physical space.
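
As a rough illustration of that last point, here is a minimal Python sketch. The LineString class and its two property names are made up for this example and are not taken from any of the specs under discussion; the coordinates are arbitrary.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class LineString:
    """Hypothetical curve type: an ordered list of (x, y, z) vertices."""

    vertices: List[Tuple[float, float, float]]

    @property
    def coordinate_dimension(self) -> int:
        # Number of coordinates per vertex, i.e. the dimension of the
        # physical space the curve is embedded in.
        return len(self.vertices[0])

    @property
    def geometric_dimension(self) -> int:
        # A position on a curve is fixed by a single parameter (distance
        # along it), whatever space the curve sits in.
        return 1


# Arbitrary example track through 3-dimensional space.
track = LineString([(0.0, 0.0, 0.0), (10.0, 5.0, 1.2), (20.0, 9.0, 3.5)])
print(track.geometric_dimension, track.coordinate_dimension)
```

So the same object can reasonably be called 1-dimensional or 3-dimensional, depending on which sense of "dimension" is meant.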

I would think that most people will understand what we mean by “dimension” from a discursive description and a few examples.

Cheers,
Jon



On 03/02/2017 08:28, "Linda van den Brink" <l.vandenbrink@geonovum.nl> wrote:

    Thanks for researching this! 
    
    To answer your question way at the bottom: I don't think we need a _normative_ definition. But it would help to have a definition that makes clear what we mean when we talk about dimensions in the BP, instead of just mentioning it casually as we do now.
    
    Or even a (small) section on dimensions among the introductory topics. 
    
    -----Original message-----
    From: Little, Chris [mailto:chris.little@metoffice.gov.uk]
    Sent: Thursday 2 February 2017 18:02
    To: Andrea Perego
    CC: W3C SDW WG - Public
    Subject: W3C RDF Data Cube dimensions definition
    
    Andrea,
    
    The formal normative document at https://www.w3.org/TR/vocab-data-cube/ does not seem to actually define 'dimensions' or 'multi-dimensional', other than as distinguishing aspects of a 'hypercube' of 'observations'.
    
    Neither does the underlying SDMX ISO statistical document at https://sdmx.org/?page_id=5008 :
    "These metadata values and concepts can be understood as the named dimensions of a multi-dimensional co-ordinate system, describing what is often called a "cube" of data."
    
    
    Section 5.2 is the most informative, but not normative:
    
    "A statistical data set comprises a collection of observations made at some points across some logical space. The collection can be characterized by a set of dimensions that define what the observation applies to (e.g. time, area, gender) along with metadata describing what has been measured (e.g. economic activity, population), how it was measured and how the observations are expressed (e.g. units, multipliers, status). We can think of the statistical data set as a multi-dimensional space, or hyper-cube, indexed by those dimensions. This space is commonly referred to as a cube for short; though the name shouldn't be taken literally, it is not meant to imply that there are exactly three dimensions (there can be more or fewer) nor that all the dimensions are somehow similar in size.
    
    A cube is organized according to a set of dimensions, attributes and measures. ...
    
    ... The dimension components serve to identify the observations. A set of values for all the dimension components is sufficient to identify a single observation. Examples of dimensions include the time to which the observation applies, or a geographic region which the observation covers......"
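
The statement that a set of values for all the dimension components identifies a single observation can be sketched in Python as a lookup keyed on those values. This is only an illustration: the dimension names come from the examples quoted above, while the observations table, the observation_for helper and the numbers in it are invented placeholders, not part of the Data Cube vocabulary.

```python
from typing import Dict, Tuple

# Dimension names taken from the examples in the quoted text.
DIMENSIONS = ("time", "area", "gender")

# Placeholder observations (invented values), keyed by one value per dimension.
observations: Dict[Tuple[str, ...], float] = {
    ("2016", "Exeter", "F"): 1.0,
    ("2016", "Exeter", "M"): 2.0,
    ("2017", "Exeter", "F"): 3.0,
}


def observation_for(**dimension_values: str) -> float:
    """Return the single observation identified by a value for every dimension."""
    missing = [name for name in DIMENSIONS if name not in dimension_values]
    if missing:
        # Without a value for every dimension the observation is not identified.
        raise KeyError(f"no value given for dimension(s): {missing}")
    key = tuple(dimension_values[name] for name in DIMENSIONS)
    return observations[key]


value = observation_for(time="2016", area="Exeter", gender="M")
```

Leaving out any one dimension value no longer pins down a single observation, which is why the helper refuses to guess.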
    
    
    Section 5.3 gives us the freedom to specialise:
    
    ".... In statistical applications it is common to work with slices in which a single dimension is left unspecified. In particular, to refer to such slices in which the single free dimension is time as Time Series and to refer slices along non-time dimensions as Sections. Within the Data Cube vocabulary we allow arbitrary dimensionality slices and do not give different names to particular types of slice. Such sub-classes of slice could be added in extension vocabularies."
    
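
A slice in the sense of Section 5.3 can be sketched in Python as fixing some dimension values and leaving the rest free. The slice_cube helper, the two-dimension cube and its values are hypothetical placeholders for illustration only; the Data Cube vocabulary itself expresses slices in RDF, not like this.

```python
from typing import Dict, Sequence, Tuple

Cube = Dict[Tuple[str, ...], float]


def slice_cube(cube: Cube, dimensions: Sequence[str], **fixed: str) -> Cube:
    """Keep observations matching the fixed values, keyed by the free dimensions."""
    unknown = set(fixed) - set(dimensions)
    if unknown:
        raise KeyError(f"not a dimension of this cube: {sorted(unknown)}")
    free = [name for name in dimensions if name not in fixed]
    result: Cube = {}
    for key, value in cube.items():
        coords = dict(zip(dimensions, key))
        if all(coords[name] == wanted for name, wanted in fixed.items()):
            # Re-key the observation by the dimensions left unspecified.
            result[tuple(coords[name] for name in free)] = value
    return result


# Invented two-dimensional cube: (time, area) -> placeholder value.
dimensions = ("time", "area")
cube: Cube = {
    ("2016", "Exeter"): 1.0,
    ("2017", "Exeter"): 2.0,
    ("2016", "Delft"): 3.0,
}

# Fixing area and leaving time free gives a "Time Series" slice.
time_series = slice_cube(cube, dimensions, area="Exeter")
# Fixing time and leaving area free gives a "Section" slice.
section = slice_cube(cube, dimensions, time="2016")
```

Fixing fewer dimensions leaves a higher-dimensional slice, which matches the "arbitrary dimensionality" wording in the quote.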
    
    Section 6 allows us to optionally order dimensions.
    
    
    Section 6.1 says that the Data Cube vocabulary represents the dimensions, attributes and measures as RDF properties.
    
    The integrity checks IC-4 and IC-5 state that dimensions have a range, and that every dimension with a range of 'concept' has a code list.
    
    
    Interestingly, 'dimension' is sub-classed into:
    1. Dimension
    2. measureDimension
    3. timeDimension
    
    A measureDimension indicates variables or types of data of interest, so our CRSs are associated with Dimension, with the proviso that this includes proper temporal CRSs. I posit that Calendars are NOT CRSs, but complex entities in their own right and fit into the timeDimension, as this has examples of dateTime, etc. 
    
    I.e. the Dimension has a specialisation that carries the ideas of a single axis, one origin, positive and negative directions, and one unit of measure.
    
    The timeDimension calendars, by contrast, have complicated cycles of durations, counts and units.
    
    Sending this though I have not finished thinking about it or delving yet.
    
    Do we need to have a normative definition of 'dimension'?
    
    Chris
    
    Chris Little
    Co-Chair, OGC Meteorology & Oceanography Domain Working Group
    
    IT Fellow - Operational Infrastructures
    Met Office  FitzRoy Road  Exeter  Devon  EX1 3PB  United Kingdom
    Tel: +44(0)1392 886278  Fax: +44(0)1392 885681  Mobile: +44(0)7753 880514
    E-mail: chris.little@metoffice.gov.uk  http://www.metoffice.gov.uk

    
    I am normally at work Tuesday, Wednesday and Thursday each week
    
    
    
    

Received on Friday, 3 February 2017 09:06:22 UTC