Re: PROV-ISSUE-67 (single-execution): Why is there a difference in what is represented by one vs multiple executions? [Conceptual Model]

Hi Jim,

I think we have different understandings of isDerivedFrom.

For me, fully expressed, isDerivedFrom should be:

isDerivedFrom(e2, e1, pe, r2, r1)

meaning that e2 was derived from e1 by means of a process
execution pe, which generated e2 with role r2 and used e1 with role r1.

In its simplified form,
isDerivedFrom(e2, e1)
implies that there exist a pe and roles r2 and r1 such that
isDerivedFrom(e2, e1, pe, r2, r1)
but pe, r2 and r1 may not have been asserted.

The point is not whether pe is atomic. It is that there is a tight link between
the derivation and the process execution.

isDerivedFromInMultipleSteps is silent about that link.

I am not trying to infer derivation beyond transitive closures.
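
A minimal sketch of this reading, in Python. It is only an illustration
under my own assumptions: the names Derivation, implied_statements and
multi_step, and the "?pe"/"?r2"/"?r1" placeholders for unasserted values,
are hypothetical and appear nowhere in the model document. A one-step
derivation is tied to one process execution (asserted or not), whereas the
multi-step relation is just the transitive closure over entity pairs and
records no process execution at all.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Set, Tuple


@dataclass(frozen=True)
class Derivation:
    """One-step isDerivedFrom(e2, e1, pe, r2, r1).

    pe, r2 and r1 are None when they exist but were not asserted.
    """
    e2: str
    e1: str
    pe: Optional[str] = None
    r2: Optional[str] = None
    r1: Optional[str] = None


def implied_statements(d: Derivation) -> List[Tuple[str, str, str, str]]:
    """Expand a one-step derivation into the generation and use it implies.

    Unasserted values are shown as placeholders (an assumption of this
    sketch) to indicate that they exist without being known.
    """
    pe = d.pe if d.pe is not None else "?pe"
    r2 = d.r2 if d.r2 is not None else "?r2"
    r1 = d.r1 if d.r1 is not None else "?r1"
    return [("isGeneratedBy", d.e2, pe, r2), ("uses", pe, d.e1, r1)]


def multi_step(derivations: Iterable[Derivation]) -> Set[Tuple[str, str]]:
    """Transitive closure of one-step derivations as (e2, e1) pairs.

    The result carries no pe or roles: it is silent about the link to
    any particular process execution.
    """
    closure = {(d.e2, d.e1) for d in derivations}
    changed = True
    while changed:
        changed = False
        for (a, b) in list(closure):
            for (c, e) in list(closure):
                if b == c and (a, e) not in closure:
                    closure.add((a, e))
                    changed = True
    return closure


# Example: e2 derived from e1 via an asserted pe1; e3 derived from e2
# with the process execution left unasserted.
steps = [Derivation("e2", "e1", "pe1", "out", "in"), Derivation("e3", "e2")]
for s in steps:
    print(implied_statements(s))
print(sorted(multi_step(steps)))
```

The pair (e3, e1) appears only in the closure: nothing here infers a single
process execution linking e3 to e1.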


On 01/08/11 16:54, Myers, Jim wrote:
> Luc,
> If we cannot tell if PEs are atomic, isDerivedFrom and
> isDerivedFromInMultipleSteps would appear to be synonyms - one can't
> make one transitive and the other not if you have no guarantee that what
> was reported as one PE for isDerivedFrom cannot be multiple PEs.
> Transitivity should not be a function of how the witness reports the PE.
> Said differently, I think the discussion about multiple steps is an
> attempt to get back to the fact that derivedFrom could be defined two
> ways if we don't know about the nature of a PE -  one that is trivial
> based on the used/generated relationships, which would be transitive,
> and one that is only assertable, which can only be transitive iff PEs
> are atomic and Bobs are atomic.  (So really we need
> isDerivedFromInMultipleStepsAndOrFromAggregateObjects()). The latter is
> really trying to get at the idea that something in the output came
> directly from the input, e.g. some physical material has been
> incorporated.
> I think it would be useful to know if the group thinks we need both. If
> so, I would suggest having names for them that don't involve discussion
> of multi-step or aggregate Bobs if we don't want to define atomicity.
> Maybe inheritsFrom (assertable) and derivedFrom(transitive)?
> IncorporatedFrom()?
>   Jim
>> -----Original Message-----
>> From: [mailto:public-prov-wg-
>>] On Behalf Of Luc Moreau
>> Sent: Monday, August 01, 2011 11:28 AM
>> To:
>> Subject: Re: PROV-ISSUE-67 (single-execution): Why is there a difference in
>> what is represented by one vs multiple executions? [Conceptual Model]
>> Hi Jim,
>> The text does not mention atomic processes, and there was (in my mind) no
>> intent of having them in the model.
>> As I asked Simon, could you explain with an example in the context of the
>> document what the problem is.
>> Thanks,
>> Luc
>> On 01/08/11 15:31, Myers, Jim wrote:
>>> +1 - there are very few 'atomic' processes that could not be described as an
>>> aggregate graph of other processes. Given that we don't know anything in PIL
>>> about the nature of processes, it seems like distinguishing direct versus
>>> multiple will not be a clear binary split and we'd essentially end up treating
>>> both the same way.
>>>    Jim
>>>> -----Original Message-----
>>>> From: [mailto:public-prov-wg-
>>>>] On Behalf Of Simon Miles
>>>> Sent: Monday, August 01, 2011 7:03 AM
>>>> To: Provenance Working Group WG
>>>> Subject: Re: PROV-ISSUE-67 (single-execution): Why is there a
>>>> difference in what is represented by one vs multiple executions?
>>>> [Conceptual Model]
>>>> Hi Luc,
>>>> I follow your argument, but it seems tangential to my point. The
>>>> following argument still seems inevitably true to me:
>>>> Activity in the world that uses one BOB and generates another *can*
>>>> be described in PIL as multiple process executions or a single
>>>> process execution (regardless of whether it actually is described in
>>>> these different ways or not, or whether accounts are required or not).
>>>> Therefore, what one process execution denotes is not distinct from
>>>> what multiple process executions denotes, we have just provided more
>>>> detail in the latter description (and this detail is, in any case,
>>>> removed when saying "is derived from").
>>>> Therefore, isDerivedFrom and isDerivedFromInMultipleSteps as defined
>>>> do not describe anything different in the world, so we have two terms
>>>> for representing the same thing.
>>>> I know that we've debated this or similar before, but it is still not
>>>> clear to me where the fault lies in my argument, or what
>>>> isDerivedFromInMultipleSteps really represents. If it's only me
>>>> that's confused, I understand there are more urgent concerns (though
>>>> I'd still like to understand).
>>>> Thanks,
>>>> Simon
>>>> On 1 August 2011 09:25, Luc Moreau wrote:
>>>>> Hi Simon,
>>>>> If I understand you correctly, you are suggesting that the following
>>>>> two assertions hold together.
>>>>> isGeneratedBy(e5,pe5,out)
>>>>> isGeneratedBy(e5,pe4,out)
>>>>> But this is not legal, since it is stated that one BOB is generated
>>>>> by at most one process execution.
>>>>> What you are suggesting should be encoded in a separate account
>>>>> (though we have not defined this yet!).
>>>>> A one-step derivation then expands to one process execution in a
>>>>> given account.
>>>>> In a separate account, there may be a multi-step derivation between
>>>>> the same two BOBs and it would expand into multiple process
>>>>> executions.
>>>>> Does it make sense?
>>>>> Regards,
>>>>> Luc
>>>>> On 07/29/2011 05:52 PM, Provenance Working Group Issue Tracker wrote:
>>>>>> PROV-ISSUE-67 (single-execution): Why is there a difference in what
>>>>>> is represented by one vs multiple executions? [Conceptual Model]
>>>>>> Raised by: Simon Miles
>>>>>> On product: Conceptual Model
>>>>>> By the definition, "a process execution represents an identifiable
>>>>>> activity". This does not seem to preclude one process execution
>>>>>> assertion denoting, at a coarse granularity, the same events in the
>>>>>> world denoted by multiple process executions in other assertions.
>>>>>> If so, then in the File Scenario example, I could add a coarse-grained
>>>>>> process execution representing the whole e1-to-e5 activity:
>>>>>>      processExecution(pe5,collaboratively-edit,t)
>>>>>>      uses(pe5,e1,in)
>>>>>>      isGeneratedBy(e5,pe5,out)
>>>>>> But then Section 5.5.2 distinguishes between "a single process execution"
>>>>>> and "one or more process executions". Following the argument above,
>>>>>> these could represent exactly the same occurrences in the world.
>>>>>> So there is no difference between what is denoted by one and multiple
>>>>>> process executions, and so no difference between isDerivedFrom and
>>>>>> isDerivedFromInMultipleSteps as described. Whether e5 was derived from
>>>>>> e1 appears to me to be entirely independent of how many process
>>>>>> executions were involved.
>>>>> --
>>>>> Professor Luc Moreau
>>>>> Electronics and Computer Science   tel:   +44 23 8059 4487
>>>>> University of Southampton          fax:   +44 23 8059 2865
>>>>> Southampton SO17 1BJ               email:
>>>>> United Kingdom
>>>> --
>>>> Dr Simon Miles
>>>> Lecturer, Department of Informatics
>>>> Kings College London, WC2R 2LS, UK
>>>> +44 (0)20 7848 1166

Received on Monday, 1 August 2011 23:06:34 UTC