- From: Graham Klyne <GK@ninebynine.org>
- Date: Wed, 25 May 2011 13:39:42 +0100
- To: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
- CC: public-prov-wg@w3.org
Luc,

Considering your example of l-value and r-value, I think it's the
implication of dereferenceability and updatability that comes with a
container that I feel is over-constraining. I don't want to prohibit
containers or modifiable entities as resources (with provenance); I just
don't think that all resources with provenance are necessarily containers
in this sense.

#g
--

Luc Moreau wrote:
> Nothing in the example is restricted to rdf or triple stores.
> It also applies to a table in a relational database (and its xml
> serialization), or an excel spreadsheet (and a csv representation).
>
> The relational database/table and the spreadsheet can be seen as
> containers, since they can be updated.
>
> The reason why it is important is that we need to consider stateful
> resources (well, I think so, don't you?).
>
> An alternative way of looking at it, adopting some old programming
> language terminology, is this:
>
>   a resource is like an l-value
>   a snapshot is like an r-value
>   an r-text is like a representation of an r-value
>
> Luc
>
> On 05/25/2011 09:53 AM, Graham Klyne wrote:
>> I have a problem with resource-as-container. I think it's too
>> constraining. My zebra example wouldn't comply.
>>
>> As for the distinction between f1 and r1 per your example, I think
>> this is rather broadening the discussion - which I'm not sure is
>> necessary or helpful.
>>
>> I would say that in this case, r1 is a service resource. And as such,
>> I don't think it makes sense to download a service. E.g. what do you
>> receive if you do a simple HTTP GET on a SPARQL endpoint URI? I think
>> it's typically some kind of intro page that explains how to use the
>> service (e.g. http://data.clarosnet.org/sparql/). The URIs that may
>> be used to download *content* from the triple store are different
>> (e.g. URI-encoded SPARQL queries, or constructed LDAPI URIs).
>>
>> So, for the purposes of this example, we need to be clearer about what
>> we mean when saying "analyst (alice) downloads a turtle serialization
>> (lcp1) of the resource (r1) from government portal" - in this context,
>> I don't think it makes sense as it stands.
>>
>> I also note that once you introduce a triple store into the mix, while
>> we can expect it to contain information that has been loaded into it,
>> when retrieving information, we have no a priori way to claim that the
>> information subsequently retrieved has to do with the original
>> resource. The best we can say is that if the entire *content* of the
>> resource "r1" is downloaded, then that content should contain as a
>> subset the RDF that was loaded. But even this isn't clear-cut - if
>> the triple store supports named graphs (which most do), then there's
>> no way to represent its entire content in a single Turtle download.
>>
>> In summary, I think the introduction of containers and triple stores
>> is mixing mechanism with essential provenance concepts here, and I
>> think we need to get the former straight before we can explain what
>> happens when more complex mechanisms are introduced. The scenario as
>> described could play perfectly well without mention of a triple store.
>>
>> #g
>> --
>>
>> Luc Moreau wrote:
>>> Hi Paul,
>>> Yesterday, I also began drafting some definition. We need
>>> representations in here too. I am not sure about
>>> your illustrations. Here is my take on it:
>>>
>>> From a provenance viewpoint, we seem to discuss several concepts
>>> related to resources. Some terminology is required to disambiguate
>>> concepts. It is inspired by terminology developed by the rdf working
>>> group (thanks to Sandro for drafting the original email!)
>>>
>>>   1. A "resource" is a container, whose contents may vary over time.
>>>      Its content may be structured in many different ways (hierarchical
>>>      XML tree, RDF arcs, etc).
>>>
>>>   2. An "r-snapshot" is a state of a resource, or a snapshot of that
>>>      resource at a specific instant. An r-snapshot is immutable. From a
>>>      resource that changes over time, one can obtain multiple
>>>      r-snapshots.
>>>
>>>   3. An "r-text" is a particular sequence of characters or bytes which
>>>      conveys a particular r-snapshot in some language. If you can parse
>>>      an r-text, you know what is in the r-snapshot it conveys. You can
>>>      tell someone exactly what is in a particular resource at some
>>>      instant by sending them an r-text. (You send them the r-text which
>>>      conveys the r-snapshot which is the current state of that resource.)
>>>
>>> In some cases, some resources do not vary over time, which means that
>>> there is a single r-snapshot for them, and some may even have a
>>> single r-text (no content negotiation). In such a specific case
>>> (static resources on the web), the three concepts conflate into a
>>> single one.
>>>
>>> The challenge is to deal with dynamic contents.
>>>
>>> Illustration inspired by the example.
>>>
>>> - government (gov) converts data (d1) to RDF file (f1) at time (t1)
>>>   using XSLT transform
>>> - government (gov) uploads RDF data (f1) into a triple store, exposed
>>>   as Web resource (r1)
>>> - analyst (alice) downloads a turtle serialization (lcp1) of the
>>>   resource (r1) from government portal
>>>
>>> Illustrations:
>>> - r1: is a resource: it's the triple store, it's a container, its
>>>   content can vary over time
>>> - lcp1: is an r-text (turtle serialization) of a given snapshot
>>>   (created by, or available at the time of, download)
>>> - f1 is a local file: it can be seen as a stateless anonymous
>>>   resource, with a single r-text.
>>>
>>> If in addition:
>>> - analyst (alice) downloads a rdf/xml serialization (lcp2) of the
>>>   resource (r1)
>>>
>>> If the content of r1 has not changed, then lcp2 and lcp1 are both
>>> r-texts of the same r-snapshot.
>>>
>>> Note that this is not limited to RDF (as Graham mentioned)
>>>
>>> - newspaper (news), uses a CMS to publish the incidence map (map1),
>>>   chart (c1) and the image (img1) within a document (art1) written
>>>   by (joe) using license (li2)
>>> - newspaper (news), updates art1, adding a correction following a
>>>   complaint from a reader
>>>
>>> Illustrations:
>>> - art1 is also a resource, with two r-snapshots (before and after
>>>   correction)
>>> - with language negotiation, an http client can download html and
>>>   xhtml representations (i.e., r-texts) of the article
>>>
>>> What do you think?
>>> Cheers,
>>> Luc
>>>
>>> On 05/25/2011 06:49 AM, Paul Groth wrote:
>>>> Hi,
>>>>
>>>> To throw out some, perhaps simpler, definitions into the mix that I
>>>> think follow along the lines of what's being discussed.
>>>>
>>>> Resource - something that can be identified
>>>>
>>>> Snapshot - the state of a resource at a particular point in time
>>>>
>>>> In the Data Journalism Scenario: a 'resource' would be the web page.
>>>> a 'snapshot' would be the web page before publication.
>>>>
>>>> cheers,
>>>> Paul
>>>>
>>>> Note: Similar concepts are found within many provenance models that
>>>> I know of... if it's helpful I can list those out
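A rough sketch of the three terms in Luc's quoted message may help make the
disagreement concrete. It is only an illustration, not anything proposed on
the list: the class names Resource and RSnapshot, the logical clock, and the
two toy serializers (as_lines and as_csv, standing in for real Turtle and
RDF/XML writers) are all assumptions. It models the container view only; a
resource that cannot be dereferenced or updated, such as the zebra example,
has no natural place in it, which is the objection raised at the top of this
message.

```python
from dataclasses import dataclass
from typing import Callable, FrozenSet, Iterable, Set, Tuple

# Hypothetical content model: a resource holds (subject, predicate, object) tuples.
Triple = Tuple[str, str, str]


@dataclass(frozen=True)
class RSnapshot:
    """Immutable state of a resource at one instant (definition 2)."""
    instant: int                  # assumed logical clock, not wall-clock time
    content: FrozenSet[Triple]

    def r_text(self, serializer: Callable[[FrozenSet[Triple]], str]) -> str:
        """Render this snapshot as an r-text in some language (definition 3)."""
        return serializer(self.content)


class Resource:
    """Container whose contents may vary over time (definition 1)."""

    def __init__(self, identifier: str) -> None:
        self.identifier = identifier
        self._content: Set[Triple] = set()
        self._clock = 0

    def update(self, triples: Iterable[Triple]) -> None:
        """Change the container's contents; later snapshots will differ."""
        self._content |= set(triples)
        self._clock += 1

    def snapshot(self) -> RSnapshot:
        """Freeze the current state; the result is unaffected by later updates."""
        return RSnapshot(self._clock, frozenset(self._content))


def as_lines(content: FrozenSet[Triple]) -> str:
    """Toy serializer: one space-separated statement per line (not real Turtle)."""
    return "\n".join(" ".join(t) + " ." for t in sorted(content))


def as_csv(content: FrozenSet[Triple]) -> str:
    """Toy serializer: one comma-separated row per statement (not real RDF/XML)."""
    return "\n".join(",".join(t) for t in sorted(content))


# r1 is updated once, then downloaded twice in different serializations.
r1 = Resource("r1")
r1.update([("ex:f1", "ex:derivedFrom", "ex:d1")])
s1 = r1.snapshot()
lcp1 = s1.r_text(as_lines)   # two r-texts ...
lcp2 = s1.r_text(as_csv)     # ... of the same r-snapshot
```

In this sketch lcp1 and lcp2 differ as strings but convey the same
r-snapshot, and a later call to r1.update() leaves s1 untouched while
r1.snapshot() yields a new, different r-snapshot.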
Received on Wednesday, 25 May 2011 12:40:33 UTC