RE: PROV-ISSUE-7 (define-derivation): Definition for Concept 'Derivation' [Provenance Terminology] from Myers, Jim on 2011-06-03 (public-prov-wg@w3.org from June 2011)

From: Myers, Jim <MYERSJ4@rpi.edu>
Date: Fri, 3 Jun 2011 08:43:48 -0400
To: Khalid Belhajjame <Khalid.Belhajjame@cs.man.ac.uk>
CC: Graham Klyne <GK@ninebynine.org>, Luc Moreau <L.Moreau@ecs.soton.ac.uk>, <public-prov-wg@w3.org>
Message-ID: <B7376F3FB29F7E42A510EB5026D99EF2051E485E@troy-be-ex2.win.rpi.edu>
Khalid,
I think the answer is obviously yes (I haven't thought of any examples
where I'd want 'no') - if some state of a thing (r2_s) is used in a
process instance to change another thing (r1) into some state (r1_s),
then r1 is 'derived' from r2 *after the process instance occurs*. In
practice, I think many of our examples have this sense (or could have)
- we might create a file (r1), then copy/paste government data into it
(a new state of r1), and claim that r1 depends on the government data
from then on, even though the file in its original state did not.
Beyond the OPM-style set of relations, I think the only addition needed
to handle this is some way to connect a thing with an aspect of the
thing that captures its state. (I would also claim that r2_s need not
be the complete state, just more stateful than r2, so we should think
about capturing the relationship 'more stateful representation of'
rather than trying to define two types absolutely.)
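
A minimal sketch of the idea above, offered only as an illustration and
not as a proposed model. The names Thing, Aspect, ProcessInstance,
state_derivations and thing_derivations are hypothetical, and the sketch
makes the simplifying assumption that every state generated by a process
instance depends on every state it used (OPM itself asserts derivation
separately).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Thing:
    """A (possibly mutable) thing, e.g. r1 or r2."""
    name: str


@dataclass(frozen=True)
class Aspect:
    """A 'more stateful representation of' a thing, e.g. r2_s."""
    name: str
    of: Thing


@dataclass(frozen=True)
class ProcessInstance:
    """A process instance with the aspects it used and generated."""
    name: str
    used: tuple
    generated: tuple


def state_derivations(p):
    """State-level pairs (derived, source).

    Assumption: each generated aspect depends on each used aspect.
    """
    return {(g, u) for g in p.generated for u in p.used}


def thing_derivations(p):
    """Lift state-level derivations to the underlying things.

    A thing is not reported as derived from itself.
    """
    return {(g.of, u.of) for (g, u) in state_derivations(p) if g.of != u.of}


# The file example: r1 is the file, r2 is the government data.
r1 = Thing("r1")
r2 = Thing("r2")
r2_s = Aspect("r2_s", of=r2)
r1_s = Aspect("r1_s", of=r1)
paste = ProcessInstance("copy_paste", used=(r2_s,), generated=(r1_s,))

print(state_derivations(paste))
print(thing_derivations(paste))
```

Under these assumptions, the only extra ingredient beyond the OPM-style
used/generated relations is the `of` link from an aspect to its thing.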

 Jim

-----Original Message-----
From: Khalid Belhajjame [mailto:Khalid.Belhajjame@cs.man.ac.uk] 
Sent: Friday, June 03, 2011 8:15 AM
To: Myers, Jim
Cc: Graham Klyne; Luc Moreau; public-prov-wg@w3.org
Subject: Re: PROV-ISSUE-7 (define-derivation): Definition for Concept
'Derivation' [Provenance Terminology]


Hi Jim,

On 03/06/2011 02:03, Myers, Jim wrote:
> What do you want to capture with derivation of mutable resources?
> Simply that one mutable resource can be used in a process and produce
> another different mutable resource? If so, I'd ask why we should
> consider this case any different than immutable?
To me, there are issues that arise when considering the derivation of
mutable resources. To illustrate, here is a simple example. Consider a
mutable resource r1, which was in a state r1_s1, and a process
execution that used as input a resource r2 in a state r2_s to
transform the state r1_s1 into the state r1_s2.

Given the above, we can say that r2_s contributed to the derivation of
r1_s2.

Now, answering the question "did the resource r2 contribute to the
derivation of the resource r1" is not obvious, or is it?

PS: Sorry Luc, I know that we probably should stop talking about this,
given that in yesterday's telecon we agreed that we will consider
"immutable things" or "things/values" that are immutable according to
some viewpoint as you suggested.

Thanks, khalid

> Does the fact that most of what we want to call immutable resources
> are undergoing constant change (bits getting refresh charges, files
> moving about in memory caches, etc.) cause any issue with the basic
> OPM-style model? I think all of these cases are handled just fine by
> OPM-style constructs, and I'd argue further that the key concept about
> artifacts was not complete immutability with respect to any process we
> can think of, but immutability with respect to the processes involved
> in the provenance. (Eggs used in cake baking do not come out as
> modified eggs (they become a new cake), but an egg in the fridge and
> the warmer egg waiting to be mixed are considered the same egg only
> because we don't want to discuss/report on the warming process that
> occurred. The fact that an egg has mutability in its temperature
> doesn't make it a bad artifact in OPM or cause trouble in reporting a
> baking process...)
>
> The mutable case that presents a question is: should we provide a
> second mechanism to allow one to describe a process that changes the
> state of a mutable resource - to say that the egg with temperature
> cold is the same egg as the one with temperature warm after a heating
> process? I suspect that we can't avoid this use case completely, but
> we might not have to create a separate mechanism: if we allow a
> resource egg to be associated with cold-egg and warm-egg resources, we
> can use the OPM-like mechanism (cold-egg <-- heating <-- warm-egg)
> while adding that cold-egg and warm-egg are 'aspects of' the same
> mutable egg, which 'participates' in a heating process. I think this
> is general and minimally disruptive. One could say that an egg
> participated in heating without creating other resources, but one
> could not directly describe the temperature of the egg before and
> after heating without creating the cold and warm egg artifacts.
>
> I think this also covers what we want from agents and sources - we
> want to convey that they participate in a process and, while their
> state changes as they do so, we don't want to document their state
> changes. But as Simon says, we may still want to treat them (e.g. the
> Royal Society) as resources and talk about their creation, so it would
> be valuable if they could just be artifacts in the context of
> creation/founding type events. Today, we have agents and sources as
> different types than artifact, so there is no way to talk about their
> founding, etc.
>
> --  Jim
>
>
>
> ________________________________
>
> From: public-prov-wg-request@w3.org on behalf of Graham Klyne
> Sent: Thu 6/2/2011 3:45 PM
> To: Khalid Belhajjame
> Cc: Luc Moreau; public-prov-wg@w3.org
> Subject: Re: PROV-ISSUE-7 (define-derivation): Definition for Concept 
> 'Derivation' [Provenance Terminology]
>
>
>
> Khalid Belhajjame wrote:
>> Hi Graham,
>>
>>> I agree that many of the examples of derivation we have raised
>>> relate to resource states.  But if, as has been suggested by myself
>>> and others, resource states are themselves resources (especially when
>>> named for the purposes of expressing a derivation), then such
>>> derivations can equally be regarded as relating resources.  I think
>>> that's more a difference of terminology than fundamental.
>>
>> Would it be fair then to say that in that view resources are 
>> immutable resources?
> In the case of resources representing a snapshot of state, yes.
>
>> Which brings me to the question: do we want to express derivations 
>> between mutable resources, or is that just something that we should 
>> avoid at this point?
> (I'm finishing this email after today's telecon, so it's a bit of a 
> re-run.)
>
> I think that many of our use-cases are based on invariant values, and 
> the near-term goal is to find expression for these.  So we definitely 
> do want to express derivations between non-varying values.  But in so 
> doing, it's not clear to me (yet) that we need to exclude mutable 
> resources, so I say let's keep our options open and not close off any
> possibilities that we don't have to.
>
> So my answer to avoiding mutable resources is: "yes and no".
>
> #g
> --
>
>
>> Thanks, khalid
>>
>>> Where I think I may diverge from what you say is that I would not 
>>> limit such expressions of derivation to resources that happen to be 
>>> a state (or snapshot of state) of some resource.  I think defining 
>>> that distinction in a hard-and-fast way, that also aligns with 
>>> various intuitions we may have about derivation, may prove difficult
>>> to achieve (e.g. as I think is suggested by Jim Myers in 
>>> http://lists.w3.org/Archives/Public/public-prov-wg/2011Jun/0015.html
>>> (*)).
>>>
>>> #g
>>> --
>>>
>>> (*) I just love the W3C mailing list archives - so easy to find 
>>> links to messages, and thus capture provenance!
>>>
>>> Khalid Belhajjame wrote:
>>>> Hi,
>>>>
>>>>   From the discussion so far on derivation, it seems that most 
>>>> people tend to define derivation between resource states or 
>>>> resource state representations, but not between resources.
>>>>
>>>> My take on this is that in a context where a resource is mutable, 
>>>> derivations will mainly be used to associate resource states and 
>>>> resource state representations.
>>>>
>>>> That said, based on derivations connecting resource states and 
>>>> resource state representations, one can infer new derivations 
>>>> between resources. For example, consider the resource r_1 and the 
>>>> associated resource state r_1_s, and consider that r_1_s was used 
>>>> to construct a new resource state r_2_s, actually the first state, 
>>>> of the resource r_2. We can state that r_2_s is derived from r_1_s, 
>>>> i.e., r_1_s ->  r_2_s. We can also state that the resource r_2 is 
>>>> derived from the resource r_1, i.e., r_1 ->  r_2.
>>>>
>>>> PS: I added a definition of derivation along these lines to the wiki:
>>>> http://www.w3.org/2011/prov/wiki/ConceptDerivation
>>>>
>>>> Thanks, khalid
>>>>
>>>>
>>>>
>>>>
>>>> On 01/06/2011 07:49, Luc Moreau wrote:
>>>>> Hi Graham,
>>>>>
>>>>> Isn't it that you used the duri scheme to name the two resource 
>>>>> states that exist in this scenario?
>>>>>
>>>>> In your view of the web, is there a notion of stateful resource?
>>>>> Does it apply here?
>>>>>
>>>>> Thanks,
>>>>> Luc
>>>>>
>>>>>
>>>>>
>>>>> On 31/05/11 23:57, Graham Klyne wrote:
>>>>>> Luc Moreau wrote:
>>>>>>> Graham,
>>>>>>>
>>>>>>> In my example, I really mean for the two versions of the chart 
>>>>>>> to be available at the same URI. (So, definitely, an uncool 
>>>>>>> URI!)
>>>>>>>
>>>>>>> In that case, there is a *single* resource, but it is stateful.
>>>>>>> Hence, there are two *resource states*, one generated using
>>>>>>> (stats2), and the other using (stats3).
>>>>>> Luc,
>>>>>>
>>>>>> I had interpreted your scenario as using a common URI as you
>>>>>> explain.
>>>>>>
>>>>>> But there are still several resources here, even if they are not all 
>>>>>> exposed on the web or assigned URIs.  I'm appealing here to 
>>>>>> anything that *might* be identified as opposed to things that
>>>>>> actually are assigned URIs.   (For example, the proposed duri:
>>>>>> scheme might be used -
>>>>>> http://tools.ietf.org/id/draft-masinter-dated-uri-07.html)
>>>>>>
>>>>>> (And the URI is perfectly "cool" if it is specifically intended 
>>>>>> to denote a dynamic resource.  A URI used to access the current 
>>>>>> weather in London can be stable if properly managed.)
>>>>>>
>>>>>> (I think this is all entirely consistent with my earlier stated
>>>>>> positions.)
>>>>>>
>>>>>> #g
>>>>>> --
>>>>>>
>>>>>>> Of course, if blogger had used cool uris, then, c2s2 and c2s3 
>>>>>>> would be different resources.
>>>>>>>
>>>>>>> Luc
>>>>>>>
>>>>>>> On 05/31/2011 02:25 PM, Graham Klyne wrote:
>>>>>>>> I see (at least) two resources associated with (c2):  one 
>>>>>>>> generated using (stats2), and another using (stats3).  We might 
>>>>>>>> call these (c2s2) and (c2s3).
>>>>>
>>>
>>
>
Received on Friday, 3 June 2011 12:45:26 UTC