Provenance from Christopher Gutteridge on 2014-05-21 (public-csv-wg@w3.org from May 2014)

From: Christopher Gutteridge <cjg@ecs.soton.ac.uk>
Date: Wed, 21 May 2014 10:02:55 +0100
To: W3C CSV on the Web Working Group <public-csv-wg@w3.org>
Message-ID: <EMEW3|51b6f040ab9819acdd75dc1668dcb6ccq4KA3903cjg|ecs.soton.ac.uk|537C6BBF.707@>

While it's not a top priority, I see an exciting use for some of the
recent provenance vocabulary work, at least for the Tabular (CSV) ->
Graph (RDF) route, where it's possible to add extra triples. We may well
know the URI of the source table and the URI of the metadata document.
That's provenance right there. I would suggest (not as a high priority)
that a recommended RDF way to express this relationship be included in
this work, e.g. triples in the output RDF saying it was generated from
source document(s) X, using metadata Y and process Z, at a given time
and date, by an agent (the organisation/person/system making the
conversion).

It should be just a handful of extra triples, and optional, but it would
be good to give people a standard to follow, and also URIs to reference
for the process followed (the algorithms being discussed now).
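
As a rough sketch of what such a handful of triples might look like, the
Python below builds a small Turtle snippet using terms from
http://www.w3.org/ns/prov#. The function name, its parameters and every
example.org URI are hypothetical, chosen only for illustration; nothing
here is an agreed output of the working group. Modelling process Z as a
prov:Plan attached through a qualified association is one possible
reading of PROV-O, not the only one.

```python
from datetime import datetime, timezone

PROV = "http://www.w3.org/ns/prov#"
XSD = "http://www.w3.org/2001/XMLSchema#"


def provenance_turtle(output_uri, source_uris, metadata_uri,
                      plan_uri, agent_uri, activity_uri, when):
    """Return Turtle saying how output_uri was produced (hypothetical helper).

    `when` is expected to be a timezone-aware datetime; it is written in UTC.
    """
    stamp = when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    # The output is derived from the source table(s) X ...
    derived = ", ".join(f"<{u}>" for u in source_uris)
    # ... and the conversion activity used both X and the metadata Y.
    used = ", ".join(f"<{u}>" for u in [*source_uris, metadata_uri])
    lines = [
        f"@prefix prov: <{PROV}> .",
        f"@prefix xsd: <{XSD}> .",
        "",
        f"<{output_uri}> a prov:Entity ;",
        f"    prov:wasDerivedFrom {derived} ;",
        f"    prov:wasGeneratedBy <{activity_uri}> ;",
        f'    prov:generatedAtTime "{stamp}"^^xsd:dateTime .',
        "",
        f"<{activity_uri}> a prov:Activity ;",
        f"    prov:used {used} ;",
        f"    prov:wasAssociatedWith <{agent_uri}> ;",
        # Assumption: process Z is expressed as the plan the agent followed.
        "    prov:qualifiedAssociation [",
        "        a prov:Association ;",
        f"        prov:agent <{agent_uri}> ;",
        f"        prov:hadPlan <{plan_uri}>",
        "    ] .",
        "",
        f"<{agent_uri}> a prov:Agent .",
        f"<{plan_uri}> a prov:Plan .",
    ]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # All URIs below are placeholders, not real resources.
    print(provenance_turtle(
        output_uri="http://example.org/data/table.ttl",
        source_uris=["http://example.org/data/table.csv"],
        metadata_uri="http://example.org/data/table-metadata.json",
        plan_uri="http://example.org/process/csv-to-rdf",
        agent_uri="http://example.org/agents/converter",
        activity_uri="http://example.org/data/table.ttl#conversion",
        when=datetime.now(timezone.utc),
    ))
```

The six triples on the output and the activity cover the "generated from
X, using Y and Z, at a time, by an agent" statement; the rest only type
the agent and the plan, and could be dropped if the output should stay
minimal.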

You can see an example of what I mean at the top of this TTL file:
http://data.southampton.ac.uk/dumps/jargon/2014-05-08/jargon.ttl
(ignore the http://purl.org/void/provenance/ns/ triples; that was the
previous vocab we used, and we are now transitioning to
http://www.w3.org/ns/prov#)

-- 
Christopher Gutteridge -- http://users.ecs.soton.ac.uk/cjg

University of Southampton Open Data Service: http://data.southampton.ac.uk/
You should read the ECS Web Team blog: http://blogs.ecs.soton.ac.uk/webteam/

Received on Wednesday, 21 May 2014 09:03:40 UTC