Re: ISSUE-31 (Aggregation): Supporting aggregation for other than SKOS hierarchies [Data Cube Vocabulary] from Benedikt Kaempgen on 2013-01-11 (public-gld-wg@w3.org from January 2013)

From: Benedikt Kaempgen <kaempgen@fzi.de>
Date: Fri, 11 Jan 2013 13:14:43 +0000
To: Government Linked Data Working Group <public-gld-wg@w3.org>
Message-ID: <0D7BFFD7C415144DA75C3D49C46AC2150B625633@ex-ms-1a.fzi.de>
Hi,

> (2) Should it be possible to explicitly provide an aggregate value at the level of a qb:Slice?
The payment ontology gives a concrete example of this case [1]:

It recommends adding the aggregated value directly to a slice:

<slice-1234> a qb:Slice, eg:PaymentByDeptPeriod;
    qb:sliceStructure   eg:paymentByDeptPeriod;
    payment:payee  eg:MyOrganization;
    payment:date   <http://reference.data.gov.uk/id/month/1>;
    payment:unit   eg:MySubunit;
    payment:totalNetAmount  '456789.00'^^xsd:decimal;
    qb:observation <expenditure-1>, <expenditure-2>, <expenditure-3>, ... .

I am wondering why it would not be sufficient to define a new value eg:allPayers for payment:payer and to add an observation that aggregates at this level:

<observation-1234> a qb:Observation;
    qb:structure   eg:PaymentsOnlyStructure;
    payment:payee  eg:MyOrganization;
    payment:date   <http://reference.data.gov.uk/id/month/1>;
    payment:unit   eg:MySubunit;
    payment:payer  eg:allPayers;
    payment:totalNetAmount  '456789.00'^^xsd:decimal .

The relationship between eg:allPayers and the individual payers could be represented using SKOS.
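
As a rough sketch of what I mean (the code list eg:payers and the individual payers eg:PayerA and eg:PayerB are made up for illustration, as is the eg: namespace URI; none of them come from the payment ontology):

```turtle
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix eg:   <http://example.org/ns#> .

# Hypothetical code list for the values of payment:payer.
eg:payers a skos:ConceptScheme ;
    skos:hasTopConcept eg:allPayers .

# The aggregate member sits at the top of the hierarchy.
eg:allPayers a skos:Concept ;
    skos:inScheme     eg:payers ;
    skos:topConceptOf eg:payers ;
    skos:narrower     eg:PayerA, eg:PayerB .

# Individual payers (assumed names) point back to the aggregate member.
eg:PayerA a skos:Concept ;
    skos:inScheme eg:payers ;
    skos:broader  eg:allPayers .

eg:PayerB a skos:Concept ;
    skos:inScheme eg:payers ;
    skos:broader  eg:allPayers .
```

An observation with payment:payer eg:allPayers would then be read as the aggregate over the observations whose payer is one of the narrower concepts.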

This way, the complexity of analysing statistical data would be reduced to comparing observations.

Of course, I have added another requirement ;-) to [2]: "There should be a recommended mechanism to allow for publication of aggregates which cross multiple dimensions".

Best,

Benedikt

[1] <http://data.gov.uk/resources/payments#variations>
[2] <http://www.w3.org/2011/gld/wiki/Data_Cube_Vocabulary/Use_Cases#There_should_be_a_recommended_mechanism_to_allow_for_publication_of_aggregates_which_cross_multiple_dimensions>

________________________________________
From: Government Linked Data Working Group Issue Tracker [sysbot+tracker@w3.org]
Sent: Friday, 17 February 2012 17:14
To: public-gld-wg@w3.org
Subject: ISSUE-31 (Aggregation): Supporting aggregation for other than SKOS hierarchies [Data Cube Vocabulary]

ISSUE-31 (Aggregation): Supporting aggregation for other than SKOS hierarchies [Data Cube Vocabulary]

http://www.w3.org/2011/gld/track/issues/31

Raised by: Dave Reynolds
On product: Data Cube Vocabulary

Data Cube, like SDMX, supports a notion of hierarchical dimensions.

For example, a data set on population might be broken down by sex. If the code list is hierarchical (with an ex:ALL top category with subcategories of ex:Male and ex:Female) then it is possible to publish a dataset with the overall population corresponding to the top level code (ex:ALL) and then the number of men and women coded against the sub-categories. In the current design skos:Concepts are used for such code lists, and thus skos:broader/skos:narrower are used to define the hierarchy.
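
A minimal sketch of such a code list and data set might look as follows. The dimension ex:sex, the measure ex:population, the data set ex:dataset, the ex: namespace URI and all figures are illustrative assumptions, not taken from any published data:

```turtle
@prefix qb:   <http://purl.org/linked-data/cube#> .
@prefix skos: <http://www.w3.org/2004/02/skos/core#> .
@prefix ex:   <http://example.org/ns#> .

# Hierarchical code list: ex:ALL is the top category.
ex:ALL    a skos:Concept ; skos:narrower ex:Male, ex:Female .
ex:Male   a skos:Concept ; skos:broader  ex:ALL .
ex:Female a skos:Concept ; skos:broader  ex:ALL .

# Overall population, coded against the top level code (made-up figure).
ex:obs-all a qb:Observation ;
    qb:dataSet    ex:dataset ;
    ex:sex        ex:ALL ;
    ex:population 100 .

# Breakdown coded against the sub-categories (made-up figures).
ex:obs-male a qb:Observation ;
    qb:dataSet    ex:dataset ;
    ex:sex        ex:Male ;
    ex:population 49 .

ex:obs-female a qb:Observation ;
    qb:dataSet    ex:dataset ;
    ex:sex        ex:Female ;
    ex:population 51 .
```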

Do we need to generalize or clarify this mechanism?

Specifically:

(1) Should Data Cube allow non-SKOS hierarchical code lists? E.g. by declaring the property which links codes into a subsumption hierarchy.

A specific use case for this is geographic information. Many statistical data sets published by governments include dimensions of time (sdmx:refTime) and area (sdmx:refArea). For representing geographic or administrative regions there are large existing linked data sets (such as http://data.ordnancesurvey.co.uk/.html) where the spatial containment relation has already been defined and is not skos:narrower. Similarly, for representing time periods like Quarters, Half (Government) years, etc., there are services such as the UK reference time service (http://www.epimorphics.com/web/wiki/using-interval-set-uris-statistical-data). Can those hierarchies be directly used for Data Cube dimensions?

See also discussion thread at [1].

(2) Should it be possible to explicitly provide an aggregate value at the level of a qb:Slice?

A specific use case for this is the publication of financial data. Some UK local authorities have published payments information using a Payments ontology[2] which derives from Data Cube. Individual expenditure line items appear as single qb:Observations; these are then grouped into qb:Slices which make up a single payment, and the total payment is given at the slice level. This may be a common pattern which, if supported directly by Data Cube, would allow for publication of aggregates which cross multiple dimensions.

The possible ways of expressing such aggregation relations are related to ISSUE-30.

[1] http://groups.google.com/group/publishing-statistical-data/browse_thread/thread/dc8d7e231d47935e/b3fd023d8c33561d?#b3fd023d8c33561d

[2] http://data.gov.uk/resources/payments
Received on Friday, 11 January 2013 13:15:08 UTC