Re: [dxwg] Distributions, services and implementation-resources

From: Simon Cox via GitHub <sysbot+gh@w3.org>
Date: Wed, 24 Oct 2018 22:36:49 +0000
To: public-dxwg-wg@w3.org
Message-ID: <issue_comment.created-432851965-1540420608-sysbot+gh@w3.org>
The matter of 'information equivalence' of 'Distributions' has come up in a couple of conversations I'm having: 
- in remote sensing, each spectral band of an image is typically distributed in a separate file. Each has the same spatio-temporal footprint, but can be used independently - e.g. see https://github.com/radiantearth/stac-spec/blob/master/item-spec/examples/landsat8-sample.json 
- in long-running missions, such as in astronomy, data may be distributed in a sequence of files representing specific time-slices, e.g. one per month or quarter - e.g. see https://data.csiro.au/dap/search?q=%22Parkes%20observations%22&p=2 
- social scientists frequently break up a dataset into pieces for distribution, and may also deliver degraded (anonymized) distributions with more liberal licenses than the full data. 

In DCAT I believe we encourage the view that the description of the `Dataset` captures all the semantics, and the description of the `Distribution` is merely serialization mechanics. All of these cases could be accommodated by that view, with (for example) each band of an image conceived as a distinct Dataset (as long as we provide a robust mechanism for relating datasets to each other). But this extra level of indirection complicates the mapping to actual running dataset catalogues, and would probably not be acceptable. 

I wonder if we need to take another look at this 'information equivalence' argument with these use-cases in mind. 

It might be accommodated by introducing an alternative predicate to relate a Distribution to its Dataset. For example, alongside the existing predicate we might add a second one:
- `dcat:distribution` - the object of which is intended to be informationally complete, and equivalent to any siblings linked with the same predicate
- `dcat:componentDistribution` - the object of which is explicitly not informationally complete, and is complemented by other Distributions

The latter would also require some semantic information on the distribution to describe which aspect (e.g. spectral-band, time-slice, spatial-tile, dimension) of the full Dataset is included in the particular Distribution. 
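
To make the two-predicate idea concrete, here is a minimal sketch in plain Python of how a catalogue client might tell complete distributions apart from component ones. It is an illustration only, under stated assumptions: `dcat:componentDistribution` is just the name suggested above and is not part of DCAT, `ex:aspect` is a hypothetical placeholder for the "which aspect" information, and the `ex:scene` resources are invented.

```python
# Predicate names as plain strings. `dcat:distribution` exists in DCAT;
# `dcat:componentDistribution` is only the proposal in this message, and
# `ex:aspect` is a hypothetical placeholder, not an existing term.
DISTRIBUTION = "dcat:distribution"
COMPONENT = "dcat:componentDistribution"
ASPECT = "ex:aspect"

# (subject, predicate, object) triples for an invented image scene:
# one complete archive plus one file per spectral band.
triples = [
    ("ex:scene", DISTRIBUTION, "ex:scene-all-bands.zip"),
    ("ex:scene", COMPONENT, "ex:scene-B4.tif"),
    ("ex:scene", COMPONENT, "ex:scene-B5.tif"),
    ("ex:scene-B4.tif", ASPECT, "spectral-band:B4"),
    ("ex:scene-B5.tif", ASPECT, "spectral-band:B5"),
]


def objects(subject, predicate):
    """Return the objects of all triples matching subject and predicate."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def describe(dataset):
    """Split a dataset's distributions into complete ones and components.

    Components are mapped to the aspect they cover, or None if no aspect
    has been stated for them.
    """
    components = {}
    for dist in objects(dataset, COMPONENT):
        aspects = objects(dist, ASPECT)
        components[dist] = aspects[0] if aspects else None
    return {
        "complete": objects(dataset, DISTRIBUTION),
        "components": components,
    }


summary = describe("ex:scene")
print(summary)
```

The point of the sketch is that a client can keep treating anything linked with `dcat:distribution` as interchangeable, while a component is only usable once the client knows which aspect of the Dataset it covers.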

GitHub Notification of comment by dr-shorthair
Please view or discuss this issue at https://github.com/w3c/dxwg/issues/411#issuecomment-432851965 using your GitHub account
Received on Wednesday, 24 October 2018 22:36:51 UTC
