Re: Review of BP Enable Data Subsetting from Bernadette Farias Lóscio on 2016-03-23 (public-dwbp-wg@w3.org from March 2016)

From: Bernadette Farias Lóscio <bfl@cin.ufpe.br>
Date: Wed, 23 Mar 2016 09:46:32 -0300
To: Deirdre Lee <deirdre@derilinx.com>
Cc: "public-dwbp-wg@w3.org" <public-dwbp-wg@w3.org>
Message-ID: <CANx1PzxHy9mn1bi2o9+XbePypzofNkL4jjncMDszF2gMOpjJwA@mail.gmail.com>

Hi all,

I think that subsetting is an important subject and we should talk about
it. However, I don't agree with keeping the BP [1]. I agree with Laufer
that the "Possible Approach of Implementation" is too narrow and it should
be improved. I also think that it's gonna be really hard to test this BP
because the "How to test" is also very subjective.

"Compare your assessment of the expected use cases and check that each of
the subsets that you expect to be requested can be returned from a single
identifier. Check also that the granularity of subsetting is consistent
throughout the dataset."

How can someone identify each of the subsets that are expected to be
requested? What is a subset? What is the granularity of subsetting?

I think that if we're gonna use these terms then they should have a "more
concrete" definition.

Cheers,
Berna

[1] http://w3c.github.io/dwbp/bp.html#EnableDataSubsetting

2016-03-23 8:37 GMT-03:00 Deirdre Lee <deirdre@derilinx.com>:

> Hi,
>
> I agree that we should keep the BP on Data Subsetting.
>
> While using an API (or SPARQL endpoint, etc.) is a best practice for
> grabbing a subset of a larger dataset, many of the issues people are facing
> around this topic concern how to model sub-datasets in DCAT and how to
> represent them in CKAN.
>
> Therefore, I suggest the example section include not only an API example,
> but also an example using DataCube and DCAT.
> Otherwise, we should explicitly say it's not best practice to model
> subsetting in DCAT.
>
> Cheers,
> Deirdre
>
>
> On 23/03/2016 03:23, Laufer wrote:
>
>
>
> Hi All,
>
> Considering that we have a BP "Provide bulk download", it makes sense to
> also have a BP about providing "subset download".
>
> My comment is about the "Possible Approach of Implementation" that I think
> is too narrow, talking only about an API as a way of access to subsets.
>
> I think it would be nice to talk about URI Templates (RFC6570) [1], Linked
> Data API [2] and even a SPARQL endpoint as possible implementation
> approaches for accessing subsets.
>
> Cheers, Laufer
>
> [1]  https://tools.ietf.org/html/rfc6570
>
> [2]  http://www.epimorphics.com/web/projects/linked-data-api
>
> --
> ------------------------------------
> Deirdre Lee, CEO & Founder
> Derilinx - Linked & Open Data Solutions
>
> Web:      www.derilinx.com
> Email:    deirdre@derilinx.com
> Address:  11/12 Baggot Court, Dublin 2, D02 F891
> Tel:      +353 (0)1 254 4316
> Mob:      +353 (0)87 417 2318
> Linkedin: ie.linkedin.com/in/leedeirdre/
> Twitter:  @deirdrelee
>
>
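For illustration only, the URI Template approach (RFC6570) that Laufer mentions in the quoted message could look roughly like the sketch below. It is not taken from the BP document: the template URL, the dataset name and the `year`/`region` parameters are all hypothetical, and the `expand` helper handles only simple `{var}` and form-style `{?a,b}` expressions, not the full RFC.

```python
import re
from urllib.parse import quote


def expand(template: str, values: dict) -> str:
    """Expand `{var}` and `{?a,b}` expressions; a small subset of RFC 6570."""

    def replace(match: re.Match) -> str:
        expr = match.group(1)
        if expr.startswith("?"):
            # Form-style query expansion: undefined variables are omitted.
            pairs = [
                f"{name}={quote(str(values[name]), safe='')}"
                for name in expr[1:].split(",")
                if values.get(name) is not None
            ]
            return "?" + "&".join(pairs) if pairs else ""
        # Simple string expansion: an undefined variable expands to nothing.
        if values.get(expr) is None:
            return ""
        return quote(str(values[expr]), safe="")

    return re.sub(r"\{([^{}]+)\}", replace, template)


# Hypothetical template: one identifier pattern covering the whole
# dataset and its subsets.
TEMPLATE = "https://example.org/data/{dataset}{?year,region}"

full = expand(TEMPLATE, {"dataset": "bus-stops"})
subset = expand(
    TEMPLATE, {"dataset": "bus-stops", "year": 2015, "region": "Recife"}
)
print(full)
print(subset)
```

Under these assumptions, the same template yields one URI for the whole dataset and one URI per subset, depending on which variables are supplied.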


-- 
Bernadette Farias Lóscio
Centro de Informática
Universidade Federal de Pernambuco - UFPE, Brazil
----------------------------------------------------------------------------

Received on Wednesday, 23 March 2016 12:47:21 UTC