Re: Coverage subgroup - document for discussion from Bill Roberts on 2016-05-18 (public-sdw-wg@w3.org from May 2016)

From: Bill Roberts <bill@swirrl.com>
Date: Wed, 18 May 2016 15:52:04 +0100
To: "Little, Chris" <chris.little@metoffice.gov.uk>
Cc: Rob Atkinson <rob@metalinkage.com.au>, Jon Blower <j.d.blower@reading.ac.uk>, "public-sdw-wg@w3.org" <public-sdw-wg@w3.org>, Roger Brackin <roger.brackin@envitia.com>
Message-ID: <CAMTVsu=mPJfT6axM5xkhB_xSzy005Z38RpjLsCLELBQxYvnZ6w@mail.gmail.com>
Hi Chris

The idea of tiling or bounding boxes of data cubes is a very interesting
one. As you know the RDF Data Cube standard was heavily based on work in
the statistical community, SDMX in particular, and one aspect of that is
that the values of statistical dimensions often don't have a clear order,
so defining the equivalent of a bounding box is not obvious.  You could of
course define formal subsets of the values of a dimension - and indeed we
find ourselves doing that in SPARQL queries to support user interface tools
that let users pick bits of interest out of a data cube.

Many statistical datasets have a time dimension, and that at least can
generally be definitively sorted (though the intervals in the dimension
values are allowed to overlap).  Most often the spatial dimension is a list
of areas of interest - e.g. we work with a lot of datasets about all local
authorities in England.  Typically we order those for presentation
according to the alphabetical order of name or perhaps lexically by their
alpha-numeric code, but that's not much use for dicing.  A common 'dice'
might be: give me the population figures for women of working age for all
five districts in Oxfordshire in 2015.

Even for stats data, having some extension of the data cube to formally
define a 'dice' rather than 'slice' could be interesting, though it would
need some kind of set-based approach.
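
To make the set-based idea concrete, here is a minimal sketch in Python of
what such a 'dice' could look like. It is only an illustration: the
Observation class, the dice function, the dimension names and the values
are all invented for this example (the values are placeholder integers,
not real population figures), and nothing here is part of the RDF Data
Cube vocabulary.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class Observation:
    """One cell of a hypothetical statistical cube (names are illustrative)."""
    area: str
    sex: str
    age_band: str
    year: int
    value: int


def dice(observations, **allowed):
    """Keep observations whose value on every named dimension is in the
    allowed set for that dimension. Dimensions not named are unconstrained."""
    return [
        obs for obs in observations
        if all(getattr(obs, dim) in values for dim, values in allowed.items())
    ]


# The five districts of Oxfordshire, plus one area outside the county.
OXFORDSHIRE_DISTRICTS = {
    "Cherwell", "Oxford", "South Oxfordshire",
    "Vale of White Horse", "West Oxfordshire",
}
areas = sorted(OXFORDSHIRE_DISTRICTS) + ["Reading"]

# Placeholder cube: the values are just a running counter, not real data.
observations = [
    Observation(area, sex, age_band, year, value=i)
    for i, (area, sex, age_band, year) in enumerate(
        product(areas, ["F", "M"], ["16-64", "65+"], [2014, 2015])
    )
]

# "Women of working age for all five districts in Oxfordshire in 2015."
result = dice(
    observations,
    area=OXFORDSHIRE_DISTRICTS,
    sex={"F"},
    age_band={"16-64"},
    year={2015},
)
```

The point is that the dice is described by one set of allowed values per
dimension, with no ordering of those values required.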

But yes, if applying the data cube approach to spatial data, where the
dimensions are x,y,z,t, then the approach to tiling/dicing becomes more
obvious and seems useful.
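
As a rough sketch of why ordered axes make this easier: if the cube is cut
into a regular grid of tiles, finding the tiles that overlap a bounding box
is simple arithmetic on each axis. The function name, the regular-grid
assumption, the half-open box convention and the numbers below are all
assumptions made for illustration, not anything defined by QB or WCS.

```python
from itertools import product
from math import ceil, floor

AXES = ("x", "y", "z", "t")


def tiles_for_bbox(bbox, origin, tile_size):
    """Return the index tuples (ix, iy, iz, it) of all tiles that overlap an
    axis-aligned box.

    bbox maps each axis to a half-open interval (low, high) with high > low;
    origin and tile_size map each axis to the grid origin and tile extent.
    """
    per_axis = []
    for axis in AXES:
        low, high = bbox[axis]
        if high <= low:
            raise ValueError(f"empty interval on axis {axis!r}")
        first = floor((low - origin[axis]) / tile_size[axis])
        # Half-open interval: a box ending exactly on a tile boundary
        # does not touch the next tile.
        last = ceil((high - origin[axis]) / tile_size[axis]) - 1
        per_axis.append(range(first, last + 1))
    return list(product(*per_axis))


origin = {"x": 0, "y": 0, "z": 0, "t": 0}
tile_size = {"x": 10, "y": 10, "z": 5, "t": 1}
bbox = {"x": (5, 25), "y": (0, 10), "z": (0, 5), "t": (3, 5)}

tiles = tiles_for_bbox(bbox, origin, tile_size)
```

Each index tuple could then be mapped to a persistent identifier for the
tile, however those identifiers end up being minted.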

Cheers

Bill



On 18 May 2016 at 14:48, Little, Chris <chris.little@metoffice.gov.uk>
wrote:

> Hi Rob,
>
>
>
> This is a bit of a diversion and probably does not help finish this SDW WG
> topic, but it is a direction I want to go:
>
>
>
> The QB model of dimensions and slices stops short of what is in OGC WCS2.0
> – where any slice can be trimmed (a form of sub-setting) to a bounding box
> aligned with the dimension axes. So far, so what.
>
>
>
> I am interested in the wholesale tiling of a data cube, as a one-off
> process, to enable a wider range of sub-setting and supporting scalability
> and reuse (if each tile is given a persistent enough id). This is not really
> anything new, and some would argue is only an implementation detail. I am
> still interested. The tiles may not contain just single values from a
> simple scalar data cube, but may contain point clouds, vector geometry or
> other stuff – whatever the contents of the original data cube were.
>
>
>
> There are a variety of applicable use cases, such as archive granule
> retrieval, data dissemination to a very large number of low powered
> devices, and boundary conditions for a large number of local weather
> prediction models.
>
>
>
> Whether the tiles are treated as a single multi-dimensional coverage or a
> collection of a large number of lower dimensional coverages, I do not mind,
> but it seems to me that this is a simple and straightforward addition to the
> QB model.
>
>
>
> Is it?
>
>
>
> Chris
>
> *From:* Rob Atkinson [mailto:rob@metalinkage.com.au]
> *Sent:* Wednesday, May 18, 2016 1:50 PM
> *To:* Bill Roberts; Jon Blower
> *Cc:* public-sdw-wg@w3.org
> *Subject:* Re: Coverage subgroup - document for discussion
>
>
>
> Hi,
>
> I've put some detail on the page
> https://www.w3.org/2015/spatial/wiki/Data_cube_for_coverage to identify
> different possible directions for this aspect.
>
>
>
> FYI My project with OGC is concerned with UC1 and UC2, which seems
> complementary to the other activities supporting this thread.
>
>
>
> Cheers
>
> Rob Atkinson
>
>
>
>
>
> On Wed, 18 May 2016 at 20:45 Bill Roberts <bill@swirrl.com> wrote:
>
> Thanks Jon, that's a useful perspective.  Certainly we talk about making
> discovery and retrieval of the data easier, working nicely with web-based
> technology etc - so we need to be clear about 'easier for whom'.
> Inevitably different people will want different things so we will have to
> be explicit about our priorities.
>
>
>
> The existing use cases cover quite a few of the scenarios you have
> sketched out, but they don't yet link those to these kinds of user
> personas.  That might be worth doing - it probably wouldn't take long.
>
>
>
> On 18 May 2016 at 11:35, Jon Blower <j.d.blower@reading.ac.uk> wrote:
>
> Hi Bill, all,
>
>
>
> Just some initial thoughts in advance of our telecon. There is lots of
> good stuff in here, and it’s all relevant to the general area of
> “Coverages”. Some of these issues are of course very complex and I don’t
> think we’ll solve them all – and in fact this group might not be the best
> place to do so.
>
>
>
> I wonder if it would help to structure the document and our thinking
> around the different audiences we might aim at. For example:
>
>
>
>  * A “web developer” might need some explanation of what a coverage is
> (“dummies’ guide”). He/she would probably like a simple API to access them,
> and some simple formats with which he/she is familiar. The applications are
> likely to be reasonably simple and visualisation-oriented, rather than
> “deep” analysis.
>
>
>
>  * A “spatial data publisher” might already be familiar with the
> terminology, but might want to know how to make his/her data more
> discoverable by mass-market search engines, or how best to make use of
> Linked Data and semantic stuff. He/she is probably going to be keen to
> describe coverage data very precisely (e.g. using the “right” CRS and
> full-res geometries), but is also interested in the cost/benefit tradeoff.
>
>
>
>  * A “data analyst/scientist” might be interested in quality and
> uncertainty, and how to bring coverage data into his/her tools (e.g. GIS,
> Python scripts). This kind of person may just want to download the data
> files in an unmodified form, although data-extraction services can be
> useful in some circumstances (and hosted processing is increasingly
> popular).
>
>
>
>  * An “environmental consultant” may have very limited time to perform
> some kind of analysis to form part of a report. If a dataset is hard to
> find, access or understand it will probably simply be omitted from the
> analysis. He/she is often interested in a very specific geographic area and
> needs to quickly establish that a dataset is trustworthy.
>
>
>
>  * An “IT provider” might be interested in scalable and maintainable web
> services for high-volume data that can be made part of his/her
> organisation’s operational procedures. He/she probably has a low tolerance
> for high-complexity or “bleeding edge” technology.
>
>
>
> This is just off the top of my head, and there are certainly more, and
> there will also be lots of overlap. And I’m sure there’s lots to argue
> about there. But this helps me, at least, put some structure on the Big
> List. For each of these kinds of user, what would be the most useful thing
> that we could do to help them (maybe a new technology, or a recommendation
> to use something existing, or an admission that the problem remains
> unsolved), in the context of this group?
>
>
>
> (Am I just reinventing the Use Cases here, or is this still useful for the
> Coverage requirements?)
>
>
>
> Cheers,
>
> Jon
>
>
> *From: *Bill Roberts <bill@swirrl.com>
> *Date: *Tuesday, 17 May 2016 23:44
> *To: *"public-sdw-wg@w3.org" <public-sdw-wg@w3.org>
> *Subject: *Coverage subgroup - document for discussion
> *Resent-From: *<public-sdw-wg@w3.org>
> *Resent-Date: *Tuesday, 17 May 2016 23:44
>
>
>
> Hi all
>
>
>
> I've made some initial notes on requirements in this wiki page:
>
>
>
> https://www.w3.org/2015/spatial/wiki/Coverage_draft_requirements
>
>
>
> I'd like to go through this on the call tomorrow (we probably won't get
> all the way through it as there is quite a lot there).  If you are joining
> the call it would be great if you could look at it in advance.
>
>
>
> Comments also welcome via this mailing list.
>
>
>
> Cheers
>
>
>
> Bill
>
Received on Wednesday, 18 May 2016 14:52:34 UTC