Re: [Data Cubes] Why this kind of Data Structure Definition from Dave Reynolds on 2012-08-15 (public-gld-comments@w3.org from August 2012)

From: Dave Reynolds <dave.e.reynolds@gmail.com>
Date: Wed, 15 Aug 2012 11:43:02 +0100
To: public-gld-comments@w3.org
Message-ID: <502B7D36.2050605@gmail.com>
On 15/08/12 10:41, Thomas Bandholtz wrote:
> Thanks, Dave,
>
> SDMX compliance is not required in my use case, but I see the point.
> The Environment domain is listed but not yet supported by SDMX, and SDMX
> is not known in this domain.
>
> I would express as much as possible in RDFS/OWL, even if this is redundant.

If you don't mind redundancy then there's nothing to stop you adding 
those RDFS/OWL constraints in addition.

> I do not see any necessary need to know SDMX if you have a data cube, so
> I would make any SDMX compliance optional for simple cases - which
> somehow leads back to SCOVO.

No, no. There's no requirement to know SDMX or enforce SDMX compliance 
to use Data Cube. The DSD design stands by itself and seems to work well 
for people. Especially for reasons #2 and #3 in my explanation.

> Am 15.08.2012 11:00, schrieb Dave Reynolds:
>>> RDF can describe ordered lists,
>>
>> Yes and the initial design used that but it proved problematic.
>>
>> At one point it was proposed that the DSD should be a list of
>> ComponentProperties so that the ordering was clear.
>>
>> The problem with that is that querying RDF lists is tricky (though the
>> advent of SPARQL 1.1 has alleviated that somewhat).  This was especially
>> annoying because the majority of cubes don't specify an order so
>> complicating access in the general case to cater for minority cases
>> was distasteful.
>
> I have drafted some non-ordered solution which allows multiple different
> UoM for multiple measures  on the dataset level.
>
> myDef:dataset a qb:Dataset ;
>      myCubes:hasUoM
>        [myCubes:property myDef:year; myCubes:uom x:years],
>        [myCubes:property myDef:age; myCubes:uom x:years],
>        [myCubes:property myDef:hatSize; myCubes:uom x:cm],
>        [myCubes:property myDef:moneyInPocket; myCubes:uom x:euro];

Not sure how this relates to the above conversation.

That seems to just express the UoM not a DSD.

For the UoM for multi-measure cubes you could use an attribute on the 
ComponentProperties as I mentioned before.

> myDef:MixedObservation rdfs:subClassOf myCubes:Observation
>     rdfs:domain :myDef:region, myDef:year , myDef:age, myDef:hatSize .
>        myDef:moneyInPocket.

Sorry, can't follow that at all.

Did you mean things like:
     myDef:year    rdfs:domain myDef:MixedObservation .
     myDef:age     rdfs:domain myDef:MixedObservation .
     myDef:hatSize rdfs:domain myDef:MixedObservation .
?

If so then that kind of works but now you can't use myDef:year etc on 
another cube of a different shape. For many of our use cases we want to 
reuse dimension and measure properties across cubes.
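
To make the reuse problem concrete, here is a minimal sketch of what a global rdfs:domain implies. The namespace URIs, myDef:otherDataset and :o2 are assumptions made up for illustration; only the myDef: property and class names come from the thread.

```turtle
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix qb:    <http://purl.org/linked-data/cube#> .
@prefix myDef: <http://example.org/myDef#> .   # assumed namespace
@prefix :      <http://example.org/data#> .    # assumed namespace

# The global domain declaration from the example above.
myDef:year rdfs:domain myDef:MixedObservation .

# A hypothetical observation in a differently shaped cube
# that reuses myDef:year as a dimension.
:o2 a qb:Observation ;
    qb:dataSet myDef:otherDataset ;
    myDef:year 2011 .

# Under RDFS entailment (rule rdfs2) a reasoner would also conclude:
#   :o2 a myDef:MixedObservation .
# which is not what the second cube intends.
```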

So to use this approach you would need OWL so as to make the association 
specific to the MixedObservation class. For example:

   myDef:MixedObservation a owl:Class ;
     rdfs:subClassOf qb:Observation ;
     rdfs:subClassOf [
       a owl:Restriction;
       owl:onProperty myDef:year;
       owl:cardinality 1 ;
     ];
     etc

Perfectly fine for you to do that in addition to a DSD. But my previous 
remarks still stand as to why that's not a necessary or sufficient 
alternative to a DSD.

> Then proceed like in Data Cubes or Scovo.
> :o1 a myDef:MixedObservation ;
>      myCubes:dataset myDef:dataset ;
>      myDef:region x:spain ;
>      myDef:year 2011 ;
>      myDef:age 50 ;
>      myDef:hatSize 52 ;
>      myDef:moneyInPocket 0.01 .
>
> I guess this again is not SDMX compliant.

?? Looks just like a Data Cube observation to me.

> Basically I propose some very few extensions (prefixed with myCubes:)
>
> myCubes:UomOfProperty a owl:Class ;
>      rdfs:domain myCubes:property ;
>      rdfs:domain myCubes:uom .
>
> myCubes:property rdfs:range qb:ComponentProperty .
> myCubes:uom rdfs:range qudt:Unit . # may be SDMX as well, this doesn't
> matter here
>
> qb:Dataset rdfs:domain myCubes:hasUoM .

Presumably you meant:
   myCubes:hasUoM rdfs:domain qb:DataSet .

> myCubes:hasUoM rdfs:range myCubes:UomOfProperty .

Seems unnecessary to have an entire new structure just for declaring UoMs.

If the attribute mechanism is proving cumbersome for you then the thing 
to consider would be to add a qb2:uom property to 
qb:ComponentSpecification so that you can define the UoM as part of the DSD:

myDef:dataset a qb:DataSet ;
     qb:structure [ a qb:DataStructureDefinition;
       qb:component [qb:dimension myDef:year;    qb2:uom x:years] ;
       qb:component [qb:dimension myDef:region;  qb2:uom x:place] ;
       qb:component [qb:measure   myDef:hatSize; qb2:uom x:cm] ;
        ...
      ];

Though I'm not yet convinced that is preferable to attaching the UoM 
directly to the component property.

Note that dimensions rarely have UoM; dimension values are typically 
coded as instances of some class or concept scheme and so the 
interpretation of their values is given by their rdfs:range.
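
A rough sketch of such a coded dimension, assuming a SKOS concept scheme is used for the codes. The namespace URIs and the x:regions scheme are hypothetical; myDef:region and x:spain are the names used earlier in the thread.

```turtle
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix skos:  <http://www.w3.org/2004/02/skos/core#> .
@prefix qb:    <http://purl.org/linked-data/cube#> .
@prefix myDef: <http://example.org/myDef#> .   # assumed namespace
@prefix x:     <http://example.org/x#> .       # assumed namespace

# The dimension's values are concepts, so no unit is needed;
# rdfs:range and qb:codeList say how to interpret them.
myDef:region a qb:DimensionProperty, qb:CodedProperty ;
    rdfs:range  skos:Concept ;
    qb:codeList x:regions .    # hypothetical concept scheme

x:regions a skos:ConceptScheme .

x:spain a skos:Concept ;
    skos:inScheme x:regions .
```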

Attaching UoM to non-coded measures by putting an attribute like 
sdmx-attribute:unitMeasure on the MeasureProperty seems simple and 
nicely symmetric with use of rdfs:range.
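
As a minimal sketch of that approach, with the unit declared next to the range on the measure property itself. The namespace URIs for myDef: and x: are assumed, and the choice of xsd:decimal as the range is only illustrative.

```turtle
@prefix rdfs:  <http://www.w3.org/2000/01/rdf-schema#> .
@prefix xsd:   <http://www.w3.org/2001/XMLSchema#> .
@prefix qb:    <http://purl.org/linked-data/cube#> .
@prefix sdmx-attribute: <http://purl.org/linked-data/sdmx/2009/attribute#> .
@prefix myDef: <http://example.org/myDef#> .   # assumed namespace
@prefix x:     <http://example.org/x#> .       # assumed namespace

# The range says what kind of value the measure takes;
# the unit attribute says how to read that value.
myDef:hatSize a qb:MeasureProperty ;
    rdfs:range xsd:decimal ;                 # illustrative choice
    sdmx-attribute:unitMeasure x:cm .
```

Any cube that reuses myDef:hatSize then picks up the unit without needing a separate UoM structure on the dataset.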

Dave
Received on Wednesday, 15 August 2012 10:43:31 UTC