Re: PROV-ISSUE-1 (define-resource): Definition for concept 'Resource' [Provenance Terminology] from Luc Moreau on 2011-05-24 (public-prov-wg@w3.org from May 2011)

From: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
Date: Tue, 24 May 2011 22:34:13 +0000
To: Simon Miles <simon.miles@kcl.ac.uk>
CC: "public-prov-wg@w3.org" <public-prov-wg@w3.org>
Message-ID: <EMEW3|d58dde2cf23da80e8ba1737b1adbc690n4NNYe08L.Moreau|ecs.soton.ac.uk|35326799>
Hi Simon,

I have some difficulty with referring to provenance in the definition of a resource.
That makes the definition circular, since we see resource as one of the concepts used to
formulate provenance.

Luc

On 24 May 2011, at 22:40, "Simon Miles" <simon.miles@kcl.ac.uk> wrote:

> Hello,
> 
> With regard to the points raised on resources, in brief I suggest:
> - For our purposes, a resource is anything which can be referred to
> and has a provenance.
> - This is equivalent to "anything that might be identified by a URI"
> anyway, so it seems sensible to use that existing definition.
> - When we talk about the provenance of a resource, we mean the
> provenance of its state on asking the question.
> - When we talk about the provenance of a resource state
> representation, we mean the provenance of its state plus how it came
> to be in that representation.
> - We would expect implementers of the recommendation to provide
> access to the provenance of a web resource state representation, but
> by the suggestions above this would anyway be the provenance of the
> resource state (just by ignoring the portion specifically relating to
> representation), and that state's provenance is equivalent to the
> resource's provenance.
> 
> In more detail, the reasons for the suggestions above:
> 
> It seems intuitive to me that what a user, or a client on their
> behalf, would ask for or expect is the provenance of a resource (in
> the web architecture sense, (a) in Luc's list). As this might be
> mutable, and so does not have one history over time, it makes sense to
> me to specify that the provenance of a resource is the provenance of
> its state on asking the question.
> 
> I agree with Jun that it would be good to include non-web resources,
> but then agree with Paul that the web architecture definition captures
> all we would want, just expressed in a way which is unusual for
> non-web settings. If we accept the above suggestion that a "resource"
> is what we'd ask for the provenance of, then surely all we mean by
> resource is something which can be referred to and which has a
> provenance? If so, then I think "might be identified with a URI" is
> one way of describing this - else, what could be referred to but could
> not be identified with a URI? And what could be identified but does not
> have a provenance?
> 
> With regards to (a) resource, (b) state and (c) representation, I
> think it makes sense to talk about the provenance of any of the three.
> Taking Graham's example, if (a) is the zebra's health, (b) is the
> zebra's health at some point in time, and (c) is a medical record
> about the zebra's health, I can envisage a meaningful response to
> asking the history of the zebra's health (a), how its health came to
> be as it is now (b) which is effectively the same as (a), or why the
> record contains what it does (c). For the purposes of provenance, it
> seems that (c) is just (b) with a bit of extra information (details of
> the particular representation) and so the provenance of (c) is just
> the provenance of (b) plus some extra (ignorable) information on how
> it came to be represented as it is.
> 
> Graham - I don't understand your argument for why a web resource
> state's ((b)'s) provenance would not be meaningful. The provenance of
> the government data at the time it was first published, for example,
> would refer to the studies which produced it, while the provenance of
> its Turtle representation would be the same plus information about
> serialisation in Turtle.
> 
> In a mail to this list which I think got lost, I said that in the
> government example I didn't understand the difference between f1 being
> "published" and r1 being "made available as a web resource", so I'm
> not clear enough on the difference between f1 and r1 to use them to
> illustrate the suggestions above.
> 
> Thanks,
> Simon
> 
> On 24 May 2011 21:13, Graham Klyne <GK@ninebynine.org> wrote:
>> Hi Luc,
>> 
>> Trimming the message this time!
>> 
>> Luc Moreau wrote:
>>  >(I wrote):
>>>> I don't think there's a need or purpose to invoke that terminology here.
>>>> 
>>>> Just consider, for the sake of discussion, a slight revision of the
>>>> example:
>>>> 
>>>> government (gov) converts data (d1) to XML (f1) at time (t1)
>>>> government (gov) generates provenance information (prov) regarding XML
>>>> (f1)
>>>> government (gov) publishes XML data (f1) along with its provenance
>>>> (prov) on a portal with a license (li1); the XML data is now available
>>>> as a Web resource (r1)
>>>>  :
>>>> 
>>>> I think the example makes just as much sense with RDF replaced by XML,
>>>> but the RDF terminology does not apply to XML data.  And, by the way,
>>>> I think this revised example also represents a use-case that we MUST
>>>> be able to support (except that instead of talking about Turtle and
>>>> RDF/XML serializations, we might talk about text/XML vs EXI
>>>> (http://www.w3.org/TR/2011/REC-exi-20110310/) serializations).
>>> 
>>> I agree that it could be XML.  But the problem is still the same.
>>> The web architecture distinguishes
>>> - resource
>>> - resource state
>>> - resource state representation
>>> 
>>> The RDF WG has introduced terminology for RDF corresponding to these
>>> concepts.
>>> 
>>> If we want to explain how provenance fits into the web architecture, we
>>> need to be able
>>> to refer to these notions.
>> 
>> OK, I see two discussion points here:
>> 
>> (a) the relevance of the RDF g-box, g-snap, g-text terminology, and
>> 
>> (b) the need to express provenance about resources/resource state/resource state
>> representation
>> 
>> Regarding (a), I think the (resources/resource state/resource state
>> representation) terminology is perfectly adequate for our current purposes, and
>> that avoids getting drawn into RDF-specific issues of RDF graph evolution.
>> Later, when we (maybe) discuss more specifically management of provenance
>> expressed using RDF, I can imagine the g-box/... terminology might be helpful.
>> 
>> Regarding (b), I've offered a viewpoint, but I remain open to persuasion.  But I
>> don't think focusing on the g-box/g-snap/g-text is going to help us here,
>> because the Web Architecture concepts are so much broader (i.e. not just RDF).
>> More important, IMO, is to identify a specific scenario that isn't handled
>> adequately, or as easily, by the provenance-of-resource case.
>> 
>> #g
>> --
> 
> 
> 
> -- 
> Dr Simon Miles
> Lecturer, Department of Informatics
> King's College London, WC2R 2LS, UK
> +44 (0)20 7848 1166
>
Received on Tuesday, 24 May 2011 22:35:11 UTC