Re: [QB] ISSUE-31 (Aggregation hierarchies) Discussion and proposal from Dave Reynolds on 2013-03-03 (public-gld-wg@w3.org from March 2013)

From: Dave Reynolds <dave.e.reynolds@gmail.com>
Date: Sun, 03 Mar 2013 21:52:49 +0000
To: Richard Cyganiak <richard@cyganiak.de>
CC: Government Linked Data Working Group <public-gld-wg@w3.org>
Message-ID: <5133C631.6080102@gmail.com>
On 03/03/13 20:43, Richard Cyganiak wrote:
> Dave,
>
> I can live with the general approach. Comments inline.
>
> On 28 Feb 2013, at 11:30, Dave Reynolds wrote:
>> PROPOSAL.  Proposed approach is a vocabulary extension:
>>
>> qb:hierarchy (domain: qb:CodedProperty, range: qb:Hierarchy)
>>    Indicates a specification of the hierarchy used for coding this property (typically a DimensionProperty). Where a skos:ConceptScheme exists with appropriate broader/narrower relations then that should be used and should be specified using qb:codeList. The qb:hierarchy declaration is only needed for situations where a suitable skos:ConceptScheme is not available.
>
> I understand that this is no longer part of the proposal, but instead qb:codeList is generalized to allow Hierarchies in the range? That would seem better to me.

Yes, that's the way I'm doing it now.

>> qb:Hierarchy (owl:Class)
>>    Specifies a hierarchy which can be used for coding. The same concepts may be members of multiple hierarchies provided that different qb:[narrowing/broadening]Property values are used for each hierarchy.
>
> Ok. Maybe call it qb:HierarchicalCodeList? That might draw attention to the similarities between the two kinds of hierarchies -- the flexible qb:Hierarch(y|icalCodeList), and the more specific skos:ConceptScheme. (Not a strong opinion.)

OK.

>> qb:AggregatableHierarchy (sub class of: qb:Hierarchy)
>>     Indicates a hierarchy in which each parent concept is a disjoint union of its child concepts, so that measures such as simple counts *may* be aggregated up the hierarchy.
>
> I don't quite see how this can work. If I know that each parent is the disjoint union of its children, I still don't know how to aggregate values. If the observations measure life expectancy, I need to average. If the observations measure population count, I need to sum. In other cases, I may have to take the minimum or maximum. It seems to me that this class only addresses one particular use case and is not a general solution to the problem of aggregating up the hierarchy.

The class can only define properties of the hierarchy; that's what it's 
about. You do indeed need additional information about measures to know 
that a measure itself can be aggregated. The full solution would require 
our postponed ISSUE-30.

However, this half of the solution is already useful on its own and 
there are cases (e.g. when the unit of measure is "count") where 
aggregation is possible without additional knowledge.
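
As a rough illustration of the "count" case (a sketch only, not part of
the proposal): assuming a hypothetical hierarchy in which each parent is
a disjoint union of its children, and made-up leaf counts, a parent's
total is just the sum over its children. The names children, counts and
aggregate_count are invented for this example.

```python
# Hypothetical hierarchy: each parent is a disjoint union of its children.
children = {
    "UK": ["England", "Wales"],
    "England": ["London", "RestOfEngland"],
}

# Hypothetical leaf-level counts (illustrative numbers, not real data).
counts = {"London": 8, "RestOfEngland": 45, "Wales": 3}


def aggregate_count(node):
    """Sum the counts of all leaves below node (or node itself if a leaf)."""
    kids = children.get(node, [])
    if not kids:
        return counts[node]
    return sum(aggregate_count(kid) for kid in kids)


uk_total = aggregate_count("UK")
```

Summing is only safe here because the children are assumed disjoint and
the measure is a count; an average or a maximum would need a different
rule, which is the part that depends on ISSUE-30.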

> I propose to drop qb:AggregatableHierarchy. It can be easily defined in a use case specific extension.

I would certainly prefer to be able to state this property of a 
hierarchy and it doesn't seem like its presence should be problematic.

However, given the current timescale, if this would hold up approval to 
move to Last Call then I'll withdraw it, with regret.

>> qb:hierarchyRoot (domain: qb:Hierarchy, range: skos:Concept)
>>    Specifies a root of the hierarchy. A hierarchy may have multiple roots but must have at least one.[7]
>
> Fine. (Is there a general assumption that the members of a hierarchy still must be skos:Concepts? I think we don't need to make that assumption. Not making the assumption may avoid some confusion and may be less controversial. In that case, the range would simply be rdfs:Resource I think.)

OK, my current draft indeed has range skos:Concept because anything can 
be a skos:Concept. Happy to drop that and make the range rdfs:Resource.

>> qb:narrowingProperty (domain: qb:Hierarchy, range: rdf:Property)
>>    Specifies a property which relates a parent concept in the hierarchy to a child concept. One of qb:narrowingProperty or qb:broadeningProperty must be given but it is not necessary to have both. Note that a child may have more than one parent.
>>
>> qb:broadeningProperty (domain: qb:Hierarchy, range: rdf:Property)
>>    Specifies a property which relates a child concept in the hierarchy to a parent concept. One of qb:narrowingProperty or qb:broadeningProperty must be given but it is not necessary to have both. Note that a child may have more than one parent.


> One of these is redundant. I'd do away with broadeningProperty, rename narrowingProperty to something like parentChildProperty (seems more intuitive to me), and point out the design pattern of asserting
>
>   ex:myHierarchy qb:parentChildProperty [ owl:inverseOf ex:parent ].

Hmm. That is certainly possible, but I'm not convinced it's clearer. 
There are certainly cases (e.g. OS data) where both are available and I 
liked the ability to specify both of them.

However, I guess I'm prepared to go along with this.

> I presume the well-formedness rules for hierarchies are something like:
>
> 1. If a dimension has a hierarchy as code list, then values of that dimension in observations must be reachable from one of the roots in zero or more hops along the parentChildProperty, except if the parentChildProperty is a blank node.
>
> 2. If the parentChildProperty has an inverse property P, then any dimension value must be reachable from one of the roots in zero or more inverse hops along P.

Seems reasonable.
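
A minimal sketch of how those two checks might be implemented, assuming
the hierarchy links have already been extracted into plain Python dicts.
The function names and the example codes are hypothetical and not part
of the vocabulary.

```python
from collections import deque


def reachable(roots, parent_child):
    """Return all nodes reachable from the roots in zero or more hops."""
    seen = set(roots)
    queue = deque(roots)
    while queue:
        node = queue.popleft()
        for child in parent_child.get(node, ()):
            if child not in seen:  # also guards against cycles
                seen.add(child)
                queue.append(child)
    return seen


def invert(child_parent):
    """Rule 2: turn child -> parents links into parent -> children links."""
    parent_child = {}
    for child, parents in child_parent.items():
        for parent in parents:
            parent_child.setdefault(parent, []).append(child)
    return parent_child


def unreachable_values(values, roots, parent_child):
    """Return the dimension values that cannot be reached from any root."""
    return set(values) - reachable(roots, parent_child)


# Hypothetical data, stated in the child -> parent direction (property P).
child_parent = {"England": ["UK"], "Wales": ["UK"], "London": ["England"]}
links = invert(child_parent)
bad = unreachable_values(["UK", "London", "Paris"], ["UK"], links)
```

Here bad contains only "Paris", the one value with no path from the
root. The blank-node exception in rule 1 is not modelled in this sketch.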

Dave
Received on Sunday, 3 March 2013 21:53:19 UTC