Re: Subsetting data

From: Phil Archer <phila@w3.org>
Date: Fri, 1 Jan 2016 09:05:25 +0000
To: Simon.Cox@csiro.au, public-sdw-comments@w3.org, public-dwbp-comments@w3.org
Message-ID: <56864155.30109@w3.org>


On 30/12/2015 21:26, Simon.Cox@csiro.au wrote:
> Another way of looking at it is that a query, encoded as a URI pattern, defines an implicit set of potential URIs, each of which denotes a subset.

True, but to be persistent, identifiers should not include queries 
against a specific API or query endpoint. That, for me, is the key 
point. OpenSearch provides a model where a query is included in a URL 
that can be considered persistent because there is a layer of 
indirection that could be changed without the URL changing, but a URL 
that includes a SQL or SPARQL query directly must be considered 
ephemeral IMO.
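
To illustrate the indirection point, here is a minimal sketch in Python. It is illustrative only: the field names (city, name), the "stations" table and the ex: vocabulary are hypothetical, and the URI shape simply follows the transport.data.gov.uk examples quoted below. The persistent URI is parsed into neutral filters, and either backend can be swapped in without the URI changing.

```python
from urllib.parse import urlparse, unquote

PREFIX = ["id", "stations"]   # path prefix taken from the quoted examples
FIELDS = ["city", "name"]     # hypothetical names for the two path levels

def parse_subset(uri):
    """Turn a persistent URI into backend-neutral filters."""
    parts = [unquote(p) for p in urlparse(uri).path.split("/") if p]
    if parts[:2] != PREFIX or len(parts) > len(PREFIX) + len(FIELDS):
        raise ValueError("not a stations subset URI: " + uri)
    return dict(zip(FIELDS, parts[2:]))

def to_sql(filters):
    """One possible backend: a parameterised SQL query (hypothetical schema)."""
    where = " AND ".join(f"{k} = ?" for k in filters)
    sql = "SELECT * FROM stations" + (f" WHERE {where}" if where else "")
    return sql, list(filters.values())

def to_sparql(filters):
    """Another possible backend: SPARQL (hypothetical ex: vocabulary).
    Values are interpolated without escaping, which a real service must not do."""
    patterns = "".join(f' ; ex:{k} "{v}"' for k, v in filters.items())
    return f"SELECT ?s WHERE {{ ?s a ex:Station{patterns} }}"

filters = parse_subset("http://transport.data.gov.uk/id/stations/Manchester")
print(to_sql(filters))
print(to_sparql(filters))
```

Neither query string ever appears in the identifier, so replacing to_sql with to_sparql (or with something else entirely) leaves the published URI untouched.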

Phil


>
> Simon J D Cox
> Environmental Informatics
> CSIRO Land and Water
>
> E simon.cox@csiro.au T +61 3 9545 2365 M +61 403 302 672
> Physical: Central Reception, Bayview Avenue, Clayton, Vic 3168
> Deliveries: Gate 3, Normanby Road, Clayton, Vic 3168
> Postal: Private Bag 10, Clayton South, Vic 3169
> http://people.csiro.au/Simon-Cox
> http://orcid.org/0000-0002-3884-3420
> http://researchgate.net/profile/Simon_Cox3
>
> ________________________________
> From: Phil Archer
> Sent: Wednesday, 30 December 2015 6:31:16 PM
> To: Manolis Koubarakis; 'public-sdw-comments@w3.org'; Annette Greiner; Eric Stephan; Tandy, Jeremy; public-dwbp-comments@w3.org
> Subject: Subsetting data
>
> At various times in recent months I have promised to look into the topic
> of persistent identifiers for subsets of data. This came up at the SDW
> F2F in Sapporo but has also been raised by Annette in DWBP. In between
> festive activities I've been giving this some thought and have tried to
> begin to commit some ideas to a page [1].
>
> During the CEO-LD meeting, Jeremy pointed to OpenSearch as a possible
> way forward, including its geo-temporal extensions defined by the OGC.
> There is also the Linked Data API as a means of doing this, and what
> they both have in common is that they offer an intermediate layer that
> turns a URL into a query.
>
> How do you define a persistent identifier for a subset of a dataset? IMO
> you mint a URI and say "this identifies a subset of a dataset" - and
> then provide a means of programmatically going from the URI to a query
> that returns the subset. As long as you can replace the intermediate
> layer with another one that also returns the same subset, we're done.
>
> The UK Government Linked Data examples tend to be along the lines of:
>
> http://transport.data.gov.uk/id/stations
> returns a list of all stations in Britain.
>
> http://transport.data.gov.uk/id/stations/Manchester
> returns a list of stations in Manchester
>
> http://transport.data.gov.uk/id/stations/Manchester/Piccadilly
> identifies Manchester Piccadilly station.
>
> All of that data of course comes from a single dataset.
>
> Does this work in the real worlds of meteorology and LBL/PNNL?
>
> Phil.
>
>
>
>
> [1] https://github.com/w3c/sdw/blob/gh-pages/subsetting/index.md
>
>
>
>
> --
>
>
> Phil Archer
> W3C Data Activity Lead
> http://www.w3.org/2013/data/
>
> http://philarcher.org
> +44 (0)7887 767755
> @philarcher1
>
>

-- 


Phil Archer
W3C Data Activity Lead
http://www.w3.org/2013/data/

http://philarcher.org
+44 (0)7887 767755
@philarcher1
Received on Friday, 1 January 2016 09:04:49 UTC
