
AW: ISSUE-59 (FC-HierarchicalCodeList): Last Call comment. Frank Cotton on qb:HierarchicalCodeList [Data Cube Vocabulary]

From: Benedikt Kaempgen <kaempgen@fzi.de>
Date: Thu, 25 Apr 2013 14:07:41 +0000
To: Dave Reynolds <dave.e.reynolds@gmail.com>, "public-gld-wg@w3.org" <public-gld-wg@w3.org>
Message-ID: <0D7BFFD7C415144DA75C3D49C46AC21512AC4AA8@ex-ms-1a.fzi.de>
Hello,

Regarding this issue, I would like to point to the discussion of ISSUE-31 that we had some time ago [1].

It may provide additional arguments for the response, e.g., that "the publisher of the geography and the publisher of the statistics themselves are often (at least in our case) different".

Best,

Benedikt 

[1] ----------------------------
________________________________________
From: Dave Reynolds [dave.e.reynolds@gmail.com]
Sent: Thursday, 17 January 2013 15:34
To: Benedikt Kaempgen
Cc: public-gld-wg@w3.org
Subject: Re: AW: ISSUE-31 (Aggregation): Supporting aggregation for other than SKOS hierarchies [Data Cube Vocabulary]

Hi Benedikt,

On 11/01/13 13:04, Benedikt Kaempgen wrote:
> Hello,
>
>> For Data Cube, of course, the issue is not whether those relationships
>> can be represented (they can already [1]) but whether Data Cube
>> dimensions can reference and reuse those representations directly.
>
> I am wondering why it would not be sufficient to recommend that publishers add skos:narrower and skos:broader relationships - if semantically applicable - alongside the domain-specific properties. I see your point that with time and geo there might be many possible values for the dimension properties; however, in general, the number of possible values of a dimension property (especially as used by a certain dataset) is small in comparison to the number of observations. Thus, I would not see a problem in adding the needed triples to the same file as the data structure definition.

There are at least two issues with that.

First, the publisher of the geography and the publisher of the
statistics themselves are often (at least in our case) different. So
requiring, for example, a local authority to publish a set of assertions
about the Ordnance Survey's administrative geography URIs just in order
to publish their own local statistics would be ... problematic.

Secondly, it's semantically dubious. The correct hierarchical
relationship for statistical geographies in this case is containment
which is different from skos:narrower/broader. This was already
discussed up-thread.

> Is there a concrete use case that would require direct usage by QB?

Several authorities in the UK including the Department for Communities
and Local Government and the Welsh Assembly Government have published
"index of deprivation" data using QB and using geographic linked data
from the Ordnance Survey. In each case the people publishing the data
(not us in either of those cases) raised the question of how they could
use the published geographic containment information as the hierarchy
definition for the cubes.

[This is the same concrete use case already recorded in the issue.]
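
The question raised in that use case, reusing published containment links as the hierarchy for a cube dimension, can be sketched roughly as follows. This is only an illustration under stated assumptions: the triples are modelled as plain Python tuples, the "ex:" areas and the observation values are invented, and "admingeo:containedBy" stands in for whatever containment property the geography publisher actually uses. It also assumes an additive measure, a tree-shaped hierarchy, and observations published only for the lowest-level areas.

```python
from collections import defaultdict

# Hypothetical containment triples: (subject, predicate, object).
# Area names are made up; the predicate is a placeholder for the
# containment property used by the geography publisher.
TRIPLES = [
    ("ex:districtA", "admingeo:containedBy", "ex:county1"),
    ("ex:districtB", "admingeo:containedBy", "ex:county1"),
    ("ex:county1", "admingeo:containedBy", "ex:region"),
    ("ex:county2", "admingeo:containedBy", "ex:region"),
]

# Hypothetical observations for the lowest-level areas: area -> value.
OBSERVATIONS = {"ex:districtA": 10, "ex:districtB": 5, "ex:county2": 7}


def children_by_parent(triples, child_to_parent_property):
    """Index the hierarchy using only the declared child-to-parent property."""
    children = defaultdict(list)
    for subject, predicate, obj in triples:
        if predicate == child_to_parent_property:
            children[obj].append(subject)
    return children


def roll_up(area, children, observations):
    """Sum the observations at or below `area` (additive measure, tree assumed)."""
    own = observations.get(area, 0)
    return own + sum(
        roll_up(child, children, observations)
        for child in children.get(area, [])
    )


children = children_by_parent(TRIPLES, "admingeo:containedBy")
totals = {
    area: roll_up(area, children, OBSERVATIONS)
    for area in ("ex:county1", "ex:region")
}
print(totals)
```

The point of the sketch is that the hierarchy is found only because the containment property is named explicitly; a consumer that looked only for skos:broader would find no hierarchy in this data, and nobody had to publish extra assertions about the geography publisher's URIs.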

Dave

> In any case, I have added a requirement "There should be a recommended mechanism to support non-SKOS hierarchical code lists" to [1].
>
> Best,
>
> Benedikt
>
> [1] <http://www.w3.org/2011/gld/wiki/Data_Cube_Vocabulary/Use_Cases#There_should_be_a_recommended_mechanism_to_support_non-SKOS_hierarchical_code_lists>
>
> ________________________________________
> From: Dave Reynolds [dave.e.reynolds@gmail.com]
> Sent: Saturday, 18 February 2012 13:34
> To: public-gld-wg@w3.org
> Subject: Re: ISSUE-31 (Aggregation): Supporting aggregation for other than SKOS hierarchies [Data Cube Vocabulary]
>
> Hi Dan,
>
> Fully agree that there are distinct semantic relationships.
>
> For Data Cube, of course, the issue is not whether those relationships
> can be represented (they can already [1]) but whether Data Cube
> dimensions can reference and reuse those representations directly.
>
> As you might expect my proposed answer is that indeed we should be able
> to use such existing representations directly, for example by annotating
> a Dimension with information on the properties used to represent the
> hierarchy.
>
> Dave
>
> [1] For sub-type/super-type we have rdfs:subClassOf. For meronomies
> there are a number of both generic and domain-specific ontologies for
> representing part-whole relations. Furthermore, OWL2 gives power (for
> those who need it) to axiomatize part-whole relations.
>
> On 17/02/12 18:48, Gillman, Daniel - BLS wrote:
>> Dave,
>>
>> I have a short comment on this one for now.  Relying on the SKOS relations of broader / narrower to express hierarchies is semantically limited.  There are 2 main kinds of hierarchical relations: generic and partitive.  Both should be available for use.  The broader / narrower constructs in SKOS subsume both, yet they are quite different.
>>
>> Generic refers to the super-type / sub-type relation.  An example is the relationship between vehicle and automobile.  An automobile is a vehicle, but not every vehicle is an automobile.
>>
>> Partitive refers to the part-whole relation.  An example is the relationship between an automobile and the engine.  The engine is a part of an automobile, but it is not an automobile itself.
>>
>> The ISO standards 704 and 1087-1 describe this very well.  SKOS references ISO 2788, but that standard is imprecise in this matter.
>>
>> You mentioned SDMX in the statistical community.  The other important metadata standard in use there is the DDI (Data Documentation Initiative), and an effort out of DDI work is to extend SKOS to include a richer semantics as I described above.  This is being called XKOS.  Richard is involved.
>>
>> Yours,
>> Dan
>>
>>
>> Dan Gillman
>> Bureau of Labor Statistics
>> Office of Survey Methods Research
>> 2 Massachusetts Ave, NE
>> Washington, DC 20212 USA
>> Tel     +1.202.691.7523
>> FAX    +1.202.691.7426
>> Email  Gillman.Daniel@BLS.Gov
>> -----------------------------------------
>> "Whatever it is, I'm against it!
>> No matter what it is or who commenced it,
>> I'm against it!"
>> ~ Groucho Marx
>> ------------------------------------------
>>
>>
>>
>> -----Original Message-----
>> From: Government Linked Data Working Group
>> Issue Tracker [mailto:sysbot+tracker@w3.org]
>> Sent: Friday, February 17, 2012 11:14 AM
>> To: public-gld-wg@w3.org
>> Subject: ISSUE-31 (Aggregation): Supporting aggregation for other than SKOS hierarchies [Data Cube Vocabulary]
>>
>> ISSUE-31 (Aggregation): Supporting aggregation for other than SKOS hierarchies [Data Cube Vocabulary]
>>
>> http://www.w3.org/2011/gld/track/issues/31
>>
>> Raised by: Dave Reynolds
>> On product: Data Cube Vocabulary
>>
>> Data Cube, like SDMX, supports a notion of hierarchical dimensions.
>>
>> For example, a data set on population might be broken down by sex. If the code list is hierarchical (with an ex:ALL top category with subcategories of ex:Male and ex:Female) then it is possible to publish a dataset with the overall population corresponding to the top level code (ex:ALL) and then the number of men and women coded against the sub-categories. In the current design skos:Concepts are used for such code lists and thus skos:broader/skos:narrower are used to define the hierarchy.
>>
>> Do we need to generalize or clarify this mechanism?
>>
>> Specifically:
>>
>> (1) Should Data Cube allow non-SKOS hierarchical code lists? E.g. by declaring the property which links codes into a subsumption hierarchy.
>>
>> A specific use case for this is for geographic information. Many statistical data sets published by governments include dimensions of time (sdmx:refTime) and area (sdmx:refArea). For representing geographic or administrative regions there are large existing linked data sets (such as http://data.ordnancesurvey.co.uk/.html) where the spatial containment relation has already been defined and is not skos:narrower. Similarly for representing time periods like Quarters, Half (Government) years etc there are services such as the UK reference time service (http://www.epimorphics.com/web/wiki/using-interval-set-uris-statistical-data). Can those hierarchies be directly used for Data Cube dimensions?
>>
>> See also discussion thread at [1].
>>
>> (2) Should it be possible to explicitly provide an aggregate value at the level of a qb:Slice?
>>
>> A specific use case for this is the publication of financial data. Some UK local authorities have published payments information using a Payments ontology[2] which derives from Data Cube. Individual expenditure line items appear as single qb:Observations; these are then grouped into qb:Slices which make up a single payment, and the total payment is given at the slice level. This may be a common pattern which, if supported directly by Data Cube, would allow for publication of aggregates which cross multiple dimensions.
>>
>> The possible ways of expressing such aggregation relations are related to ISSUE-30.
>>
>> [1] http://groups.google.com/group/publishing-statistical-data/browse_thread/thread/dc8d7e231d47935e/b3fd023d8c33561d?#b3fd023d8c33561d
>>
>> [2] http://data.gov.uk/resources/payments
>>
>>
>>
>
>
---------------------------------

________________________________________
From: Dave Reynolds [dave.e.reynolds@gmail.com]
Sent: Wednesday, 10 April 2013 15:43
To: public-gld-wg@w3.org
Subject: Re: ISSUE-59 (FC-HierarchicalCodeList): Last Call comment. Frank Cotton on qb:HierarchicalCodeList [Data Cube Vocabulary]

On 10/04/13 13:49, Government Linked Data Working Group Issue Tracker wrote:
> ISSUE-59 (FC-HierarchicalCodeList): Last Call comment. Frank Cotton on qb:HierarchicalCodeList [Data Cube Vocabulary]
>
> http://www.w3.org/2011/gld/track/issues/59
>
> Raised by: Dave Reynolds
> On product: Data Cube Vocabulary
>
> Last Call comment from Frank Cotton: http://lists.w3.org/Archives/Public/public-gld-comments/2013Apr/0061.html
>
> [[[
> On the whole, I am very hesitant about the introduction of the qb:HierarchicalCodeList class and associated properties. This raises the more general problem of how to derive SKOS concept schemes from sets of resources that have some kind of "real world" hierarchical relations that are not hierarchies between codes in a code list, but hierarchies in some specific sense between objects. The example of geographic territories is a good one: here you have the territorial inclusion relation that induces a broader/narrower relation between associated items in a code list, but of course we do not want to confuse a region, for example, with an item in a code scheme, nor territorial inclusion with "broader concept". It seems to me that the approach you describe somewhat fuels the confusion.
>
> A different approach would be to explicitly generate a SKOS concept scheme parallel to the "real world" hierarchy and to use a property like foaf:focus to link the concepts to the things they represent (see http://lists.w3.org/Archives/Public/public-esw-thes/2010Aug/0002.html).
>
> I think that this is a very general problem that should be addressed in an ad hoc W3C or other group aimed at developing a recommended practice, rather than treated here and there in different fashions.
> ]]]

I propose that we push back in this case.

We failed to mark this item "At Risk" (arguably an oversight) so
withdrawing it would mean another Last Call which would probably mean
not getting to CR before we close.

We had one comment [1] specifically endorsing this feature so removing
it would also risk Last Call comments the other way round.

Possible sketch response text:

[[[
Thank you for your comments on qb:HierarchicalCodeList; we understand
your hesitation.

This was added in response to feedback from groups who have tried to use
Data Cube and found a requirement for something of this nature in order
to be able to use existing geographic or admin-geographic hierarchies as
dimensional values. See, for example, Last Call comment [i] endorsing this feature.

We're not sure that adding this option necessarily fuels confusion
between skos:broader and, say, admingeo:containedBy. One could equally
argue that having to create a parallel skos hierarchy in which quite
different notions like geographic containment and administrative
containment are all translated to skos:broader would itself fuel such
confusion.

In any case, in the reported use cases minting a parallel skos hierarchy
is not practical. In technical terms it is possible, but keeping it up
to date as the primary geographic hierarchy changes is a challenge (in
the absence of standardized change notification for linked data
resources). In social/political terms it is much harder.
There is a strong preference to use officially supported identifiers for
administrative areas when publishing information pertaining to those areas.

Our preference would be to retain the qb:HierarchicalCodeList in Data
Cube at this time since it gives users who have expressed this
requirement a usable option. We would qualify the text to clarify that
this construct should not be used in cases where a suitable SKOS concept
scheme exists or could reasonably be created.

Would this be acceptable to you?

Your suggestion of ad hoc W3C, or other, group work to develop best
practice in this area is a good one. Such a group could include in their
advice recommendations on when and how qb:HierarchicalCodeList is best
used, which may include a recommendation that it be deprecated in
favour of a broader solution when one is available.

Dave

[i]
http://lists.w3.org/Archives/Public/public-gld-comments/2013Apr/0021.html
]]]

One other thought is whether we can add something similar to "At Risk"
statements between LC and CR. I.e. could we now mark this section as
"May be deprecated, seeking further feedback from implementers"? Is
there such a notion?

Dave

[1]
http://lists.w3.org/Archives/Public/public-gld-comments/2013Apr/0021.html



Received on Thursday, 25 April 2013 14:08:06 UTC
