BPs on Data Identification

Dear all,

Hadley and I took an action *way* back to offer some new text for the 
section on Data Identification. I have finally got around to doing it, 
although I am sure I could add more in future.

The text I have (re)written is on the wiki at [1]. My suggestion is that 
this replaces the text in the current editor's draft for the Data 
Identification section [2]. It extends the intro and creates a new BP 
around IDs for versions.

In more detail:

The section intro is unambiguous about identifiers. The current text 
says that datasets should have identifiers. My version says that they 
should have URIs (for W3C, this is gospel).

I felt it wise to set out why URIs are so important and the differences 
between URL, URI and IRI.
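
A minimal sketch of the URI/IRI distinction, assuming Python's standard 
library. The function name iri_to_uri and the example addresses are 
hypothetical and are not taken from the wiki text. An IRI may contain 
non-ASCII characters; mapping it to a URI means IDNA-encoding the host 
and percent-encoding the UTF-8 bytes elsewhere. This simplified version 
ignores ports and user info in the authority part.

```python
from urllib.parse import quote, urlsplit, urlunsplit

# Characters left untouched when percent-encoding. "%" is included so
# that sequences which are already percent-encoded are not encoded twice.
_SAFE = "/?%:@!$&'()*+,;="


def iri_to_uri(iri: str) -> str:
    """Map an IRI to an ASCII-only URI (simplified illustration)."""
    parts = urlsplit(iri)
    # Host names use IDNA (Punycode) rather than percent-encoding.
    host = parts.netloc.encode("idna").decode("ascii")
    # Path, query and fragment: percent-encode the UTF-8 bytes.
    path = quote(parts.path, safe=_SAFE)
    query = quote(parts.query, safe=_SAFE)
    fragment = quote(parts.fragment, safe=_SAFE)
    return urlunsplit((parts.scheme, host, path, query, fragment))


if __name__ == "__main__":
    print(iri_to_uri("http://example.org/données"))
```

A URI that is already plain ASCII, such as [2] below, passes through 
unchanged, which reflects the fact that every URI is also a valid IRI.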

I've taken this a little further and set out what I regard as some of 
the key points about URIs (currently 5 of them). This is drawn from 
experience of talking to various groups. I am guilty of assuming that 
everyone knows this stuff - which I really, really shouldn't!

BP: Use persistent URIs as identifiers.

This is a modified version of the existing BP. It introduces the notion 
of persistence and, if the WG adopts this, I plan to include the 10 
Rules for Persistent Identifiers that the current text links to.

It also refers to purl.org and DOIs as alternatives.

Then I added a new BP on identifiers for versions. This doesn't 
contradict the BP in the previous section that talks about Memento but 
the two together are perhaps a little awkward - something to discuss.

I put all this on the wiki, not in the BP doc, as it is for the editors 
and the WG to decide whether this is going in the right direction. I can 
add it to the doc quickly if needed (needless to say with or without 



For tracker, this is Issue-157

[1] https://www.w3.org/2013/dwbp/wiki/BP_Data_Identification
[2] http://w3c.github.io/dwbp/bp.html#DataIdentification


Phil Archer
W3C Data Activity Lead

+44 (0)7887 767755

Received on Wednesday, 3 June 2015 18:25:44 UTC