Re: [QB] ISSUE-33 Discussion and possible resolutions from Richard Cyganiak on 2013-03-02 (public-gld-wg@w3.org from March 2013)

From: Richard Cyganiak <richard@cyganiak.de>
Date: Sat, 2 Mar 2013 19:02:44 +0000
To: Dave Reynolds <dave.e.reynolds@gmail.com>
Cc: Government Linked Data Working Group <public-gld-wg@w3.org>
Message-Id: <DDF3B66F-A844-47BE-B046-B097D7990425@cyganiak.de>
On 27 Feb 2013, at 18:02, Dave Reynolds wrote:
> My preference is for 2, though I have a conflict of interest here.[4]
> 
> If that is a flexibility too far then I would prefer 1.c or 1.b so that existing deployments that use qb:observation this way can continue to at least use that predicate in this way.

I am not in favour of 2. The reason is that the term “slice” strongly implies a regular structure, and using that term for a construct whose intended uses include arbitrary collections of observations seems undesirable to me.

Proposal: Define qb:ObservationGroup, make qb:Slice a subclass of that, define domain of qb:observation to be ObservationGroup. (“Collection” is too generic a term. “ObservationCollection” would work but is a bit long.)

My preference would then be to leave observation groups completely unconstrained, so that even observations from different datasets could be grouped. That way we don't even have to start thinking about attachment levels or well-formedness constraints for groups :-) Perhaps we should state that users should subclass qb:ObservationGroup if they want to use it.
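
To make the proposal concrete, here is a minimal sketch in plain Python, with triples modelled as tuples of prefixed-name strings rather than with an RDF library. It assumes qb:ObservationGroup exists as proposed above (it is not in the vocabulary yet). The ex: names and the helper functions are hypothetical and purely illustrative. The check only walks rdfs:subClassOf; it does not perform the type inference that an rdfs:domain declaration would license.

```python
RDF_TYPE = "rdf:type"
SUBCLASS_OF = "rdfs:subClassOf"
DOMAIN = "rdfs:domain"

# Proposed vocabulary changes (qb:ObservationGroup is the proposed new class).
VOCAB = {
    ("qb:Slice", SUBCLASS_OF, "qb:ObservationGroup"),
    ("qb:observation", DOMAIN, "qb:ObservationGroup"),
}

# Hypothetical data: a user-defined group subclass next to a regular slice.
DATA = {
    ("ex:LatestGroup", SUBCLASS_OF, "qb:ObservationGroup"),
    ("ex:latest", RDF_TYPE, "ex:LatestGroup"),
    ("ex:latest", "qb:observation", "ex:obs1"),
    ("ex:latest", "qb:observation", "ex:obs2"),
    ("ex:slice1", RDF_TYPE, "qb:Slice"),
    ("ex:slice1", "qb:observation", "ex:obs1"),
}

GRAPH = VOCAB | DATA


def superclasses(cls, triples):
    """Return cls together with every class reachable via rdfs:subClassOf."""
    seen, todo = set(), [cls]
    while todo:
        current = todo.pop()
        if current in seen:
            continue
        seen.add(current)
        todo.extend(o for s, p, o in triples if s == current and p == SUBCLASS_OF)
    return seen


def is_observation_group(node, triples):
    """True if node is typed with qb:ObservationGroup or one of its subclasses."""
    return any(
        s == node and p == RDF_TYPE
        and "qb:ObservationGroup" in superclasses(o, triples)
        for s, p, o in triples
    )


def observations(group, triples):
    """Observations attached to a group via qb:observation, sorted by name."""
    return sorted(o for s, p, o in triples if s == group and p == "qb:observation")
```

In this sketch both ex:latest and ex:slice1 count as observation groups, so qb:observation can be used on either, and nothing restricts which observations a plain group may contain.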

Best,
Richard




On 27 Feb 2013, at 18:02, Dave Reynolds wrote:
> ISSUE-33 [1] is about generalizing qb:slice.
> 
> As it says in the current spec:
> 
> """Slices allow us to group subsets of observations together. This not intended to represent arbitrary selections from the observations but uniform slices through the cube in which one or more of the dimension values are fixed."""
> 
> Slices confer three benefits:
> 1. guides consuming applications in how to present the data
> 2. provides an identifier (e.g. for external annotation)
> 3. allows for a less bulky, abbreviated format
> 
> Thus normal usage for qb:slice is that each slice is associated with a qb:SliceKey, which in turn lists those dimensions that are fixed in the corresponding qb:Slices. The vocabulary does not currently include any formal OWL restrictions that require this, but the spec should formalize it one way or another as part of addressing ISSUE-29.
> 
> When using Data Cube for measurement data, such as environmental information, we have found it useful to also have collections of observations which don't correspond to a specific qb:SliceKey but represent some hard-to-compute view across the data which is useful for presentational purposes (benefit #1). For example, in the Environment Agency publication of Bathing Water quality information there are sets representing the latest available value for all Bathing Waters [2]. So there is one value for each location but the time dimension is an implicit "latest available" rather than an explicit specific time.[3] That data uses subclasses of qb:Slice to represent such collections and so uses qb:observation to link to the observations themselves.
> 
> So the issue here is whether we sanction and support such use cases or not.
> 
> I see N possible approaches.
> 
> 1.a Reject. A data cube is well formed only if there is a qb:SliceKey for each qb:Slice and if each observation within the qb:Slice has the same value for each fixed dimension, which can be attached to the qb:Slice in abbreviated mode.  An application is free to invent a new class to represent arbitrary collections of observations but cannot use qb:observation to link to them.
> 
> 1.b As 1.a but generalize qb:observation to be open domain so that it could be reused in such circumstances.
> 
> 1.c As 1.a but provide a qb:Collection class to use for such purposes and make the domain of qb:observation be the union of qb:Slice and qb:Collection (or open).
> 
> 2. Allow. A qb:Slice is simply a collection of qb:Observations which are grouped together to aid data consumers. If a qb:Slice has a qb:SliceKey then all observations in a given slice should have the same value for every fixed dimension. However, a qb:Slice may be used to represent other collections of observations and in those cases lack a qb:SliceKey.
> 
> My preference is for 2, though I have a conflict of interest here.[4]
> 
> If that is a flexibility too far then I would prefer 1.c or 1.b so that existing deployments that use qb:observation this way can continue to at least use that predicate in this way.
> 
> Comments?
> 
> Dave
> 
> [1] http://www.w3.org/2011/gld/track/issues/33
> (oops, should have put issue links in my earlier messages, too late)
> 
> [2] http://environment.data.gov.uk/data/bathing-water-quality/in-season/slice/latest
> 
> [3] The issue is not about how non-monotonic things can be represented in RDF; don't let that aspect of this example divert from the QB question.
> 
> [4] While it was not me who did that particular modelling, it is my company that publishes the Environment Agency data.
>
Received on Saturday, 2 March 2013 19:03:14 UTC