RE: Webby Data

Dear all,

The "URI Question" comes back periodically as it is not something that can be explained with a small section in the DWBP specs.

The first step for data on (and off) the web is proper identification, taking into account all the aspects, such as:

- Resource and variants (format, language, version)
- Identification of variants, both directly and via content negotiation
- Metadata
- Granularity - fragments (# vs. /)

The URI Question will not go away until it is addressed comprehensively.
http://dragoman.org/comuri


Regards
Tomas

-----Original Message-----
From: Erik Wilde [mailto:dret@berkeley.edu] 
Sent: Monday, October 12, 2015 6:51 PM
To: Makx Dekkers; 'Phil Archer'; 'Public DWBP WG'
Cc: 'Tandy, Jeremy'
Subject: Re: Webby Data

hello makx.

On 2015-10-10 23:47, Makx Dekkers wrote:
> 2. Identification of parts of a dataset. I think that is what you mean by ‘data point’ but maybe that term is not the best, as it seems to imply some numerical value for an observation. I myself would favour a term like ‘part of a dataset’ or ‘data item’.
>
> On this second issue, we may need to include some warnings. In some cases, a part of a dataset by itself may not be understandable without access to information about the dataset as a whole; e.g. for an observation, you may need to know how and why it was observed; for an article in a law, you may need to know what a particular term means in this specific context.

maybe i am mistaken, but i think part of the motivation here was my 
comment to please talk about URI fragments. if you're using fragments, 
then the context *is* in the resource, as the fragment URI reference 
simply identifies a sub-resource within it. that's the neat thing about 
fragment identifiers: you don't lose the context, because resource 
granularity guarantees that it remains intact.
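
a minimal sketch of that point, in python: stripping the fragment from an item URI always yields the URI of the containing resource, so the context is one dereference away. the example.org URI and the "row=5" fragment are made up for illustration, and split_fragment is a hypothetical helper name.

```python
from urllib.parse import urldefrag


def split_fragment(uri):
    """Split a URI reference into (resource URI, fragment identifier)."""
    # urldefrag is the standard-library call that separates the "#..." part
    resource_uri, fragment = urldefrag(uri)
    return resource_uri, fragment


# hypothetical item URI: one row inside a CSV dataset
item_uri = "http://example.org/datasets/air-quality.csv#row=5"
resource_uri, fragment = split_fragment(item_uri)
print(resource_uri)  # the resource that carries the context
print(fragment)      # the sub-resource within it
```

a client only ever requests the part before the "#"; the fragment is resolved locally against the retrieved representation.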

> One approach would certainly be to create URIs that are in some way derived from the dataset URI, which I understand is the approach of CSVW at http://www.w3.org/TR/tabular-metadata/#uri-template-properties. However, in the absence of a ‘standard’ way of creating ‘item URIs’ from dataset URIs, it may not be possible to know what the dataset URI is from looking at the item URI, at least not in a machine-readable way.

this *only* applies if dataset granularity is different from resource 
granularity, i.e. the publisher decides to serve a dataset as multiple 
resources. maybe it is worth being very explicit about the fact that 
different resource granularity probably should imply different 
navigational structures, at the hypermedia level.

if this is about multi-resource datasets, then i think it is important 
to be very clear that *any* method that involves URI patterns has to be 
explicit (i.e., any kind of URI hacking is an anti-pattern and should 
be avoided). it can be URI templates, or much more often it probably 
simply is a link. the existing web link relations will help you with 
this, for example a "start" link will tell clients how to get from some 
resource to a starting point of a resource collection.

http://www.iana.org/assignments/link-relations/link-relations.xhtml
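
as a rough python sketch of following a link instead of hacking URIs: the client reads the "start" relation from an HTTP Link header and resolves it against the current resource URI. everything here is an assumption for illustration: the example.org URIs, the header value, and the find_link helper, whose parsing is deliberately simplified (it does not handle commas inside URIs or quoted parameter values).

```python
import re
from urllib.parse import urljoin


def find_link(link_header, rel, base_uri):
    """Return the absolute target of the first link with relation `rel`, or None."""
    # simplified: split on commas, assuming none appear inside URIs or parameters
    for part in link_header.split(","):
        match = re.match(r'\s*<([^>]*)>(.*)', part)
        if not match:
            continue
        target, params = match.groups()
        rel_match = re.search(r';\s*rel\s*=\s*"?([^";]*)"?', params, re.IGNORECASE)
        # a rel parameter may hold several space-separated relation types
        if rel_match and rel in rel_match.group(1).lower().split():
            return urljoin(base_uri, target)
    return None


# hypothetical response header for one page of a multi-resource dataset
header = '</datasets/air-quality/>; rel="start", </datasets/air-quality/page/4>; rel="next"'
current = "http://example.org/datasets/air-quality/page/3"
print(find_link(header, "start", current))
```

the client never inspects the path of the page URI; it only follows what the publisher has stated explicitly.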


there are more existing and useful link relations there (such as the 
RFC5005 ones, which for some weird reason now seem to be partly 
"overwritten" by HTML5), and maybe it would be worth mentioning them 
explicitly, so that people reuse link relations when possible, instead 
of inventing their own.

cheers,

dret.

-- 
erik wilde | mailto:dret@berkeley.edu  -  tel:+1-510-2061079 |
            | UC Berkeley  -  School of Information (ISchool) |
            | http://dret.net/netdret http://twitter.com/dret |

Received on Tuesday, 13 October 2015 12:35:47 UTC