- From: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
- Date: Wed, 25 May 2011 08:42:43 +0100
- To: public-prov-wg@w3.org
Hi Paul,

Yesterday, I also began drafting some definitions. We need representations in here too. I am not sure about your illustrations.

Here is my take on it:

From a provenance viewpoint, we seem to discuss several concepts related to resources. Some terminology is required to disambiguate these concepts. It is inspired by terminology developed by the RDF working group (thanks to Sandro for drafting the original email!)

1. A "resource" is a container, whose contents may vary over time. Its content may be structured in many different ways (hierarchical XML tree, RDF arcs, etc).

2. An "r-snapshot" is a state of a resource, or a snapshot of that resource at a specific instant. An r-snapshot is immutable. From a resource that changes over time, one can obtain multiple r-snapshots.

3. An "r-text" is a particular sequence of characters or bytes which conveys a particular r-snapshot in some language. If you can parse an r-text, you know what is in the r-snapshot it conveys. You can tell someone exactly what is in a particular resource at some instant by sending them an r-text. (You send them the r-text which conveys the r-snapshot which is the current state of that resource.)

In some cases, resources do not vary over time, which means that there is a single r-snapshot for them, and some may even have a single r-text (no content negotiation). In such a specific case (static resources on the web), the three concepts collapse into a single one. The challenge is to deal with dynamic contents.

Illustration inspired by the example.

- government (gov) converts data (d1) to RDF file (f1) at time (t1) using an XSLT transform
- government (gov) uploads RDF data (f1) into a triple store, exposed as Web resource (r1)
- analyst (alice) downloads a turtle serialization (lcp1) of the resource (r1) from government portal

Illustrations:
- r1 is a resource: it's the triple store, it's a container, its content can vary over time
- lcp1 is an r-text (turtle serialization) of a given r-snapshot (created by, or available at the time of, download)
- f1 is a local file: it can be seen as a stateless anonymous resource, with a single r-text.

If in addition:
- analyst (alice) downloads an rdf/xml serialization (lcp2) of the resource (r1)

If the content of r1 has not changed, then lcp2 and lcp1 are both r-texts of the same r-snapshot.

Note that this is not limited to RDF (as Graham mentioned)

- newspaper (news) uses a CMS to publish the incidence map (map1), chart (c1) and the image (img1) within a document (art1) written by (joe) using license (li2)
- newspaper (news) updates art1, adding a correction following a complaint from a reader

Illustrations:
- art1 is also a resource, with two r-snapshots (before and after correction)
- with content negotiation, an http client can download html and xhtml representations (i.e., r-texts) of the article

What do you think?

Cheers,
Luc

On 05/25/2011 06:49 AM, Paul Groth wrote:
> Hi,
>
> To throw out some, perhaps simpler, definitions into the mix that I
> think follow along the lines of what's being discussed.
>
> Resource - something that can be identified
>
> Snapshot - the state of a resource at particular point in time
>
> In the Data Journalism Scenario: a 'resource' would be the web page. a
> 'snapshot' would be the web page before publication.
>
> cheers,
> Paul
>
> Note: Similar concepts are found within many provenance models that I
> know of....if it's helpful I can list those out
>

-- 
Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
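
As a rough aid to reading the three definitions in the message above, here is a minimal Python sketch of how a resource, an r-snapshot and an r-text might relate. It is not part of the original message. The class names, the `serialize` helper and the example triple are all hypothetical, and the "serialization" is only a stand-in for real Turtle or RDF/XML output. Only r1, lcp1 and lcp2 are names taken from the illustration.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class RSnapshot:
    """Immutable state of a resource (hypothetical model)."""
    content: frozenset  # here: a set of (subject, predicate, object) tuples


@dataclass(frozen=True)
class RText:
    """A concrete character sequence that conveys one r-snapshot."""
    snapshot: RSnapshot
    media_type: str
    data: str


@dataclass
class Resource:
    """A container whose content may vary over time."""
    name: str
    content: set = field(default_factory=set)

    def snapshot(self) -> RSnapshot:
        # Freeze the current content so later updates cannot alter it.
        return RSnapshot(frozenset(self.content))


def serialize(snapshot: RSnapshot, media_type: str) -> RText:
    # Placeholder serializer (assumption): tags the text with the media type
    # instead of producing genuine Turtle or RDF/XML.
    body = "\n".join(" ".join(triple) for triple in sorted(snapshot.content))
    return RText(snapshot, media_type, f"[{media_type}]\n{body}")


# r1 is the triple store exposed as a Web resource; the triple is made up.
r1 = Resource("r1")
r1.content.add(("ex:d1", "ex:publishedBy", "ex:gov"))

# Two downloads in different formats while r1 is unchanged.
lcp1 = serialize(r1.snapshot(), "text/turtle")
lcp2 = serialize(r1.snapshot(), "application/rdf+xml")
same_snapshot = lcp1.snapshot == lcp2.snapshot  # two r-texts, one r-snapshot

# Once r1 changes, a new r-snapshot exists; the old one is untouched.
r1.content.add(("ex:d1", "ex:format", "ex:RDF"))
changed = r1.snapshot() != lcp1.snapshot
```

In this sketch an r-snapshot is identified by its content alone, which is one way to read "if the content of r1 has not changed, then lcp2 and lcp1 are both r-texts of the same r-snapshot"; a model that also recorded the instant of capture would need a separate notion of content equality.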
Received on Wednesday, 25 May 2011 07:43:20 UTC