Re: WG discussion: proposal to remove BP 13 - Provide subsets for large spatial datasets from Jeremy Tandy on 2017-03-02 (public-sdw-wg@w3.org from March 2017)

From: Jeremy Tandy <jeremy.tandy@gmail.com>
Date: Thu, 02 Mar 2017 12:33:26 +0000
To: Clemens Portele <portele@interactive-instruments.de>, SDW WG Public List <public-sdw-wg@w3.org>
Message-ID: <CADtUq_2XusY+wTpP0TumiXxJ_J+sj5BXxTAqGn1FJAHRbSaDRA@mail.gmail.com>

Clemens

+1 from me.

I would suggest the following changes to accommodate the removal of BP13 ...

Up in the §12.6 intro material, where you refer to DWBP's BP18, add a
comment about why subsetting spatial data is often necessary. BP13 "why"
already says:

```
Spatial datasets, particularly coverages
<http://w3c.github.io/sdw/bp/#dfn-coverage> such as satellite imagery,
sensor measurement time-series and climate prediction data, are often very
large. In these cases it is useful to provide subsets by having identifiers
for conveniently sized subsets of large datasets that Web applications can
work with.
```

Effectively, breaking up a large coverage into pre-defined lumps that you
can access via HTTP GET requests is a _very simple_ API!
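
To make the "pre-defined lumps" idea concrete, here is a minimal sketch, not
taken from the BP text. It assumes a coverage on a regular grid, and the base
URL, the `/tiles/{row}/{col}` path layout and the tile size are all
hypothetical choices made purely for illustration.

```python
from itertools import product


def tile_urls(base_url, width, height, tile_size):
    """Yield (url, bbox) pairs for fixed-size tiles covering a width x height grid.

    bbox is (x0, y0, x1, y1) in grid cells; edge tiles are clipped to the grid.
    The URL layout is a hypothetical convention, not something the BP prescribes.
    """
    cols = -(-width // tile_size)   # ceiling division
    rows = -(-height // tile_size)
    for row, col in product(range(rows), range(cols)):
        x0, y0 = col * tile_size, row * tile_size
        x1, y1 = min(x0 + tile_size, width), min(y0 + tile_size, height)
        yield f"{base_url}/tiles/{row}/{col}", (x0, y0, x1, y1)


# Hypothetical coverage: 1000 x 600 cells, cut into 256-cell tiles.
tiles = list(tile_urls("http://example.org/coverage/sst-2017", 1000, 600, 256))
```

Each URL is then a stable identifier for one subset, which a Web application
can fetch with a plain GET.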

In the examples for SDW BP13 we refer to DataCube slices. This is already
covered in DWBP so we can ditch that. Another of the [suggested] examples
is "Mapping a URI template (as specified in [RFC6570
<http://w3c.github.io/sdw/bp/#bib-RFC6570>]) to a WCS
<http://w3c.github.io/sdw/bp/#dfn-web-coverage-service-wcs> or OPeNDAP
<http://www.opendap.org/> service end-point". Reflecting on this, I wonder
if this approach should be listed as a mechanism that can help to "Reuse
your existing spatial data infrastructure" - as stated in BP11? You already
mention "wrapper, proxy or a shim layer", but mentioning the URI template
would be useful too. Alternatively, Example 22 (talking about the
Environment Agency Bathing Water Quality API and the Linked Data API) might
be a good place for it, as the Linked Data API configuration uses URI
templates to provide RESTful access to SPARQL queries, thereby sparing the
user the challenge of writing generalised SPARQL queries and understanding
the underpinning data model. In fact, I think it would be worth fleshing
out this example anyway.

(for reference, documentation on Epimorphics' implementation "ELDA" can be
found here: http://epimorphics.github.io/elda/current/index.html)
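
As a rough sketch of the "URI template mapped to a WCS end-point" idea, a
shim could look something like the following. This is an assumption-laden
illustration, not how any existing service does it: the template, the
end-point URL and the axis labels `Lat`/`Long` are hypothetical (real axis
labels depend on the coverage), and only simple `{var}` expressions from
RFC 6570 are handled.

```python
import re
from urllib.parse import parse_qs, urlencode, urlparse

# Hypothetical friendly URI template and WCS end-point.
TEMPLATE = "/coverage/{coverageId}/{latMin},{latMax}/{longMin},{longMax}"
WCS_ENDPOINT = "http://example.org/wcs"


def template_to_regex(template):
    """Turn a template with simple {var} expressions into a matching regex."""
    parts = re.split(r"\{(\w+)\}", template)  # literals at even indexes, names at odd
    pattern = ""
    for i, part in enumerate(parts):
        if i % 2 == 0:
            pattern += re.escape(part)
        else:
            pattern += f"(?P<{part}>[^/,]+)"
    return re.compile("^" + pattern + "$")


PATTERN = template_to_regex(TEMPLATE)


def to_wcs_request(path):
    """Map a friendly path onto a WCS 2.0 GetCoverage KVP request, or None."""
    match = PATTERN.match(path)
    if match is None:
        return None
    v = match.groupdict()
    params = [
        ("service", "WCS"),
        ("version", "2.0.1"),
        ("request", "GetCoverage"),
        ("coverageId", v["coverageId"]),
        ("subset", f"Lat({v['latMin']},{v['latMax']})"),
        ("subset", f"Long({v['longMin']},{v['longMax']})"),
    ]
    return WCS_ENDPOINT + "?" + urlencode(params)


url = to_wcs_request("/coverage/sst-2017/50,55/-5,2")
print(parse_qs(urlparse(url).query)["subset"])
```

The user only ever sees the friendly path; the shim hides the WCS request
syntax in much the same way the Linked Data API hides the SPARQL.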

Finally, I wonder whether we have a gap. Currently BP13 talks about using
"PROV-O <https://www.w3.org/TR/prov-o/> to describe the relationship between
the subset, the original large dataset and the mechanism used to derive the
subset". I'm not so worried about PROV-O, but I think that it would be
worth asserting that it is useful to relate the subset to the complete
resource from whence it came. Re-reading your edits to BP11, I think that
we may have this covered where you talk about "paging" responses (using LDP
or Hydra pagination).
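
For illustration only, relating a subset back to its parent could be as light
as the JSON-LD built below. The choice of `dct:isPartOf` alongside
`prov:wasDerivedFrom` is my assumption rather than anything agreed in the BP
text, and both URIs are hypothetical.

```python
import json


def describe_subset(subset_uri, parent_uri):
    """Build a minimal JSON-LD description linking a subset to its parent dataset."""
    return {
        "@context": {
            "dct": "http://purl.org/dc/terms/",
            "prov": "http://www.w3.org/ns/prov#",
        },
        "@id": subset_uri,
        "dct:isPartOf": {"@id": parent_uri},         # assumed vocabulary choice
        "prov:wasDerivedFrom": {"@id": parent_uri},  # PROV-O derivation link
    }


# Hypothetical subset and parent identifiers.
doc = describe_subset(
    "http://example.org/coverage/sst-2017/tiles/0/0",
    "http://example.org/coverage/sst-2017",
)
print(json.dumps(doc, indent=2))
```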

Hope that helps.

Jeremy

On Wed, 1 Mar 2017 at 17:41 Clemens Portele <
portele@interactive-instruments.de> wrote:

> Hi all,
>
> in the BP call today [1] we discussed, if BP 13 [2] could or should be
> removed.
>
> The rationale would be:
> * DWBP now has BP 18 ("Provide Subsets for Large Datasets") [3] which has
> almost the same name and already covers most of the aspects. It also
> mentions the RDF Data Cube Vocabulary.
> * DW BP 18 is referenced and discussed in the introduction of section 12.6
> and in BP 11 [4].
> * Currently it feels as if there is not enough content left to keep a
> separate BP providing actionable guidance (beyond what is already in DW BP
> 18 and SDW BP 11 on that topic).
> * If content from BP13 should be kept, it could be integrated into BP 11.
>
> Any thoughts?
>
> Clemens
>
> [1] https://www.w3.org/2017/03/01-sdwbp-minutes.html
> [2] http://w3c.github.io/sdw/bp/#ids-for-chunks
> [3] https://www.w3.org/TR/dwbp/#ProvideSubsets
> [4] http://w3c.github.io/sdw/bp/#bp-exposing-via-api
>

Received on Thursday, 2 March 2017 12:34:12 UTC