- From: Phil Archer <phila@w3.org>
- Date: Fri, 1 Jan 2016 09:38:48 +0000
- To: Dan Brickley <danbri@google.com>, Clemens Portele <portele@interactive-instruments.de>, Rob Atkinson <rob@metalinkage.com.au>
- Cc: Simon Cox <Simon.Cox@csiro.au>, amgreiner@lbl.gov, ericphb@gmail.com, jeremy.tandy@metoffice.gov.uk, koubarak@di.uoa.gr, public-dwbp-comments@w3.org, public-sdw-comments@w3.org
On 31/12/2015 11:09, Dan Brickley wrote:
> Isn't a "subset" just a query result, of which there are effectively an
> unlimited number?

Yes.

>
> Storing a query so it can be re-run against evolving data has value. Having
> a URI for that, perhaps less so.

http://xmlns.com/foaf/spec/ for example? That's a very stable URI for an
evolving document/dataset that many people find useful ;-)

Phil.

>
> Dan
>
> On Thu, 31 Dec 2015, 08:14 Clemens Portele <
> portele@interactive-instruments.de> wrote:
>
>> Rob,
>>
>> what you describe seems to apply to the dataset (resource) the same way it
>> would apply to any subset resource. I.e. are you discussing a more general
>> question, not the subsetting question?
>>
>> Phil,
>>
>> a (probably often unproblematic) restriction to the temperature/uk/london
>> or stations/manchester approach is that there is only one path, so you end
>> up with limitations on the subsets. If you want to support multiple
>> subsets, e.g. also stations where high speed trains stop, stations that
>> have a ticket shop, etc. then there are several issues with a
>> /{dataset}/{subset}/…/{subset}/{object} approach. These include an unclear
>> URI scheme ("manchester" and "eurostar" would be on the same path level),
>> potential name collisions of subset names of different subsetting
>> categories, and multiple URIs for the same feature/object.
>>
>> Best regards,
>> Clemens
>>
>>
>> On 31 Dec 2015, at 03:07, Rob Atkinson <rob@metalinkage.com.au> wrote:
>>
>> I'm not a strong set-theoretician - but it strikes me there are some
>> tensions here:
>>
>> Does the identifier of a set mean that the members of that set are
>> constant, known in advance and always retrievable? Is a query endpoint a
>> resource (does either URI or URL have meaning against a query that delivers
>> real time data - including the use case of "at this point in time we think
>> these things are members of this set?")
>>
>> If the subset is the result of a query - and you care that it is the same
>> subset another time you look at it - are you actually assigning an
>> identifier to the artefact - which is the query response, whose properties
>> include the original query, where it was made, and the time it was made?
>>
>> Can you define an ontology for terms like subset, query, response that you
>> all agree on?
>>
>> I share Phil's implicit concern that subsetting by type with URI patterns
>> may not be universally applicable - IMHO that equates to a "sub-register"
>> pattern, where a set has its members defined by some identifiable process
>> (independent of any query functions available) - which may include explicit
>> subsets - for example by object type, or delegated registration processes.
>> That probably fits the UK implementation better than a query-defined
>> subset.
>>
>> If subsets have some prior meaning - and a query is used to access them
>> from a service endpoint - then the query is a URL that needs to be bound to
>> the object URI. AFAICT that's a very different thing to saying an arbitrary
>> query result defines a subset of data.
>>
>> I think you may, in general, assign an ID to the artefact which is the
>> result of a query at a given time, and if you want to make that into
>> something with more semantics then you need to make it into a new type of
>> object which can be described in terms of what it means. I think currently
>> the conversation is conflating these two perspectives of "subset".
>>
>> Cheers, and farewell to 2015.
>> Rob Atkinson.
>>
>>
>> On Thu, 31 Dec 2015 at 08:26 <Simon.Cox@csiro.au> wrote:
>>
>>> Another way of looking at it is that a query, encoded as a URI pattern,
>>> defines an implicit set of potential URIs, each of which denotes a subset.
>>>
>>> Simon J D Cox
>>> Environmental Informatics
>>> CSIRO Land and Water
>>>
>>> E simon.cox@csiro.au T +61 3 9545 2365 M +61 403 302 672
>>> Physical: Central Reception, Bayview Avenue, Clayton, Vic 3168
>>> Deliveries: Gate 3, Normanby Road, Clayton, Vic 3168
>>> Postal: Private Bag 10, Clayton South, Vic 3169
>>> http://people.csiro.au/Simon-Cox
>>> http://orcid.org/0000-0002-3884-3420
>>> http://researchgate.net/profile/Simon_Cox3
>>>
>>> ------------------------------
>>> *From:* Phil Archer
>>> *Sent:* Wednesday, 30 December 2015 6:31:16 PM
>>> *To:* Manolis Koubarakis; 'public-sdw-comments@w3.org'; Annette Greiner;
>>> Eric Stephan; Tandy, Jeremy; public-dwbp-comments@w3.org
>>> *Subject:* Subsetting data
>>>
>>> At various times in recent months I have promised to look into the topic
>>> of persistent identifiers for subsets of data. This came up at the SDW
>>> F2F in Sapporo but has also been raised by Annette in DWBP. In between
>>> festive activities I've been giving this some thought and have tried to
>>> begin to commit some ideas to a page [1].
>>>
>>> During the CEO-LD meeting, Jeremy pointed to OpenSearch as a possible
>>> way forward, including its geo-temporal extensions defined by the OGC.
>>> There is also the Linked Data API as a means of doing this, and what
>>> they both have in common is that they offer an intermediate layer that
>>> turns a URL into a query.
>>>
>>> How do you define a persistent identifier for a subset of a dataset? IMO
>>> you mint a URI and say "this identifies a subset of a dataset" - and
>>> then provide a means of programmatically going from the URI to a query
>>> that returns the subset. As long as you can replace the intermediate
>>> layer with another one that also returns the same subset, we're done.
>>>
>>> The UK Government Linked Data examples tend to be along the lines of:
>>>
>>> http://transport.data.gov.uk/id/stations
>>> returns a list of all stations in Britain.
>>>
>>> http://transport.data.gov.uk/id/stations/Manchester
>>> returns a list of stations in Manchester
>>>
>>> http://transport.data.gov.uk/id/stations/Manchester/Piccadilly
>>> identifies Manchester Piccadilly station.
>>>
>>> All of that data of course comes from a single dataset.
>>>
>>> Does this work in the real worlds of meteorology and UBL/PNNL?
>>>
>>> Phil.
>>>
>>>
>>> [1] https://github.com/w3c/sdw/blob/gh-pages/subsetting/index.md
>>>
>>>
>>> --
>>>
>>> Phil Archer
>>> W3C Data Activity Lead
>>> http://www.w3.org/2013/data/
>>>
>>> http://philarcher.org
>>> +44 (0)7887 767755
>>> @philarcher1
>>>
>>
>

--

Phil Archer
W3C Data Activity Lead
http://www.w3.org/2013/data/

http://philarcher.org
+44 (0)7887 767755
@philarcher1
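
The "intermediate layer that turns a URL into a query" in the quoted original message could be sketched roughly as below. This is only an illustration under stated assumptions, not how transport.data.gov.uk, OpenSearch or the Linked Data API actually work: the in-memory STATIONS list, the record fields and the function names uri_to_filter and resolve are all hypothetical, and only the three URI shapes come from the thread.

```python
from urllib.parse import urlparse

# Hypothetical stand-in for the single underlying dataset (illustrative records).
STATIONS = [
    {"city": "Manchester", "name": "Piccadilly"},
    {"city": "Manchester", "name": "Victoria"},
    {"city": "London", "name": "Euston"},
]


def uri_to_filter(uri):
    """Translate a subset URI into a filter (the 'query') over the dataset.

    Assumed path shape: /id/stations[/{city}[/{name}]]
    """
    parts = [p for p in urlparse(uri).path.split("/") if p]
    if parts[:2] != ["id", "stations"] or len(parts) > 4:
        raise ValueError(f"not a station subset URI: {uri}")
    # Each extra path segment narrows the subset by one more field.
    return dict(zip(["city", "name"], parts[2:]))


def resolve(uri, dataset=STATIONS):
    """Return the records of the subset that the URI identifies."""
    wanted = uri_to_filter(uri)
    return [rec for rec in dataset if all(rec[k] == v for k, v in wanted.items())]


if __name__ == "__main__":
    base = "http://transport.data.gov.uk/id/stations"
    for uri in (base, base + "/Manchester", base + "/Manchester/Piccadilly"):
        print(uri, "->", resolve(uri))
```

Because the URI only names the subset and resolve() is what produces it, the resolver could be swapped for another backend without changing the identifier. The fixed segment order (city, then name) also shows the single-path limitation raised above: a second subsetting category, such as stations with a ticket shop, has no unambiguous place in this scheme.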
Received on Friday, 1 January 2016 09:38:10 UTC