Re: PROV-ISSUE-43 (derivation-time): Deriviation should have associated time [Conceptual Model] from Satya Sahoo on 2011-08-05 (public-prov-wg@w3.org from August 2011)

From: Satya Sahoo <satya.sahoo@case.edu>
Date: Fri, 5 Aug 2011 18:15:34 -0400
To: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
Cc: Khalid Belhajjame <Khalid.Belhajjame@cs.man.ac.uk>, public-prov-wg@w3.org
Message-ID: <CAOMwk6yQNkOOwr4zw+42Jo==LV10rA65a2-2=hTrOQq-v9PFGA@mail.gmail.com>
Hi Luc,
Sorry about the delayed response.

> For me, the rule is really about an existential quantifier over r0, r1 and
> pe.  They exist, but we don't know which.  With further knowledge about the
> system, we may or we may not be able to identify them. Of course, there is
> the case when we know them. In that case, we probably need a further
> notation:
> isDerivedFrom(e1,e0,pe,r1,r0)  from which we can infer
> isGeneratedBy(e1,pe,r1) and uses(pe,e0,r0). But to me, the key rule is the
> one with the existential quantifier.
So the first point we can conclude is that no inference rule is involved
in the scenario we are discussing; we are discussing property restrictions
(e.g. an existential quantifier).

I am not sure I understand your use of the term "existential quantifier" above.
If we use the predicate logic notion of an existential quantifier and define
an existential restriction over the isDerivedFrom property to PE, then
every time we make an assertion using isDerivedFrom we will be required to
explicitly identify the PE instance (otherwise the existential restriction
will be false). This will prevent applications/users from using the
isDerivedFrom property when they do not have complete information, which
is often the case on the Web.

Alternatively, if you mean that every time we have an assertion using
isDerivedFrom it is understood that there is "some" PE that links the two
Bobs (with generation and use properties), then this will be covered by the
open world assumption. It will be up to the user/application to create
additional assertions with PE (and generation and use properties)
corresponding to the isDerivedFrom property (assuming the required
information is available to the user/application to make these assertions).
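
A minimal sketch of this second reading, in Python, may make the distinction
concrete. It is only an illustration under stated assumptions: the tuple
encoding of assertions, the Unknown placeholder class and the function
introduce_process_execution are hypothetical names, not part of the model
document. An isDerivedFrom assertion is accepted on its own, and the rule only
records that some unidentified pe, r1 and r0 exist, much like blank nodes in
RDF.

```python
from dataclasses import dataclass
from itertools import count

_fresh_ids = count()


@dataclass(frozen=True)
class Unknown:
    """Hypothetical placeholder: something asserted to exist but not identified."""
    label: str


def fresh(prefix):
    """Return a new placeholder with a label not used before."""
    return Unknown(f"_:{prefix}{next(_fresh_ids)}")


def introduce_process_execution(facts):
    """For each isDerivedFrom(e1, e0), state that some pe, r1, r0 exist.

    The returned facts mention only placeholders, so no concrete process
    execution is identified by applying the rule.
    """
    inferred = set()
    for fact in facts:
        if fact[0] == "isDerivedFrom":
            _, e1, e0 = fact
            pe, r1, r0 = fresh("pe"), fresh("r"), fresh("r")
            inferred.add(("isGeneratedBy", e1, pe, r1))
            inferred.add(("uses", pe, e0, r0))
    return inferred


# The derivation is asserted without any knowledge of the process execution.
facts = {("isDerivedFrom", "e1", "e0")}
inferred = introduce_process_execution(facts)
for fact in sorted(inferred, key=str):
    print(fact)
```

The same placeholder pe appears in both the generation and the use, but nothing
ties it to a named process execution. A user or application that later learns
the actual PE can assert it explicitly alongside these facts.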


> I don't understand your sentence. What do you mean by temporal dimension?
Sorry, I meant temporal (time) value.


> Why can't an output be generated after an input was read, but without
> causal link, simply by coincidence.  Derivation would not hold then.
I am not sure what is meant by coincidence here?
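
One possible reading of the quoted objection, sketched in Python under
assumptions that are not in the thread: the integer timestamps, the tuple
layout and the actually_influenced_by table are hypothetical and only stand in
for knowledge about the system. A process execution reads e0 and later
generates e1, but e1 does not depend on e0. A converse rule based only on the
ordering t0 < t1 would still report a derivation.

```python
# Timed assertions as (relation, first argument, second argument, time).
assertions = [
    ("uses", "pe", "e0", 1),           # pe reads e0 at t0 = 1
    ("isGeneratedBy", "e1", "pe", 2),  # pe generates e1 at t1 = 2
]

# Hypothetical ground truth: the inputs that actually influenced each output.
# Here e1 was produced without any dependence on e0.
actually_influenced_by = {"e1": set()}


def converse_rule(timed_assertions):
    """Candidate rule: a use before a generation by the same pe implies derivation."""
    uses = [(pe, e, t) for rel, pe, e, t in timed_assertions if rel == "uses"]
    gens = [(e, pe, t) for rel, e, pe, t in timed_assertions if rel == "isGeneratedBy"]
    return {
        (e1, e0)
        for e1, gen_pe, t1 in gens
        for use_pe, e0, t0 in uses
        if gen_pe == use_pe and t0 < t1
    }


candidates = converse_rule(assertions)
# Pairs the rule reports even though no causal link exists.
spurious = {
    (e1, e0) for e1, e0 in candidates if e0 not in actually_influenced_by[e1]
}
print("rule reports:", candidates)
print("spurious:", spurious)
```

If this is the intended meaning of "coincidence", the time ordering alone is
not enough evidence for isDerivedFrom(e1,e0).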

Thanks.

Best,
Satya



On Wed, Jul 27, 2011 at 3:00 PM, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:

>
> Hi Satya,
>
>
> On 27/07/11 18:07, Satya Sahoo wrote:
>
> Hi all,
> > A derivation, which by definition expresses that some characterized entity
> > is transformed from, created from, or affected by another characterized
> > entity, entails a process execution that transforms, creates or affects this
> > characterized entity.
>
> > This is formalized by the following inference rule, referred to as process
> > execution introduction:
> > if isDerivedFrom(e1,e0) holds, then there exists a process execution pe,
> > and roles r0,r1, such that: isGeneratedBy(e1,pe,r1) and uses(pe,e0,r0).
>
>  Again, I restate the issue - how do we know which "pe" unless the
> information is already asserted? If the information is already asserted, in
> a format that makes the correlation between e1, pe and e0 and pe explicit -
> then there is no requirement for the rule (my understanding is that rules
> are to make implicit knowledge explicit and not restate explicit knowledge
> already available).
>
>
> For me, the rule is really about an existential quantifier over r0, r1 and
> pe.  They exist, but we don't know which.  With further knowledge about the
> system, we may or we may not be able to identify them.
>
> Of course, there is the case when we know them. In that case, we probably
> need a further notation:
>
> isDerivedFrom(e1,e0,pe,r1,r0)  from which we can infer
> isGeneratedBy(e1,pe,r1) and uses(pe,e0,r0).
>
> But to me, the key rule is the one with the existential quantifier.
>
>
>
> > The converse inference does not hold. Indeed, when a generation
> > isGeneratedBy(e1,pe,r1) precedes uses(pe,e0,r0), for some e0, e1, r0, r1,
> > and pe, one cannot infer derivation isDerivedFrom(e1,e0) since the values
> > of attributes of e1 cannot possibly be determined by the values of
> > attributes of e0, given the creation of e1 precedes the use of e0.
> I believe this is an incorrect version of my proposal, which assumed
> that uses(pe,e0,r0) precedes isGeneratedBy(e1,pe,r1).
>
>
> This was not intended to follow your proposal. It's a counter example to
> show that the converse rule does not hold.
>
>
>  If we consider the scenario where isGeneratedBy(e1,pe,r1) precedes
> uses(pe,e0,r0), then how can we infer that isDerivedFrom(e1,e0)
> entails isGeneratedBy(e1,pe,r1) and uses(pe,e0,r0), since we do not have any
> temporal dimension associated with any of the above assertions (as Paul had
> suggested)?
>
>  I don't understand your sentence. What do you mean by temporal dimension?
>
>
>
>  If we explicitly associate time with the
> assertions isGeneratedBy(e1,pe,t1), uses(pe,e0,t0) and t0<t1, then the
> alternate proposal I had suggested will hold in some cases (where there is a
> single input and single output of a process execution). But the original
> rule isDerivedFrom(e1,e0) :- isGeneratedBy(e1,pe,t1) and uses(pe,e0,t0) will
> still not hold.
>
>
> Even in the very special case you indicate, it's not obvious that this rule
> holds at all.
>
> Why can't an output be generated after an input was read, but without
> causal link, simply by coincidence.  Derivation would not hold then.
>
> Luc
>
>
>
>  Thanks.
>
>  Best,
> Satya
>
>
> On Wed, Jul 27, 2011 at 10:44 AM, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
>
>>
>> Hi,
>>
>> The latest version of the document includes a new section that relates
>> derivation to process execution [1]. I copy the text here for convenience.
>> Note that this does not directly resolve this issue, but it provides us
>> with a basis to discuss time in the context of derivation.
>>
>>
>> 5.5.1 Relationship between derivation and process execution
>>
>> A derivation, which by definition expresses that some characterized entity
>> is transformed from, created from, or affected by another characterized
>> entity, entails a process execution that transforms, creates or affects this
>> characterized entity.
>>
>> This is formalized by the following inference rule, referred to as process
>> execution introduction:
>> if isDerivedFrom(e1,e0) holds, then there exists a process execution pe,
>> and roles r0,r1, such that: isGeneratedBy(e1,pe,r1) and uses(pe,e0,r0).
>>
>> The converse inference does not hold. Indeed, when a generation
>> isGeneratedBy(e1,pe,r1) precedes uses(pe,e0,r0), for some e0, e1, r0, r1,
>> and pe, one cannot infer derivation isDerivedFrom(e1,e0) since the values of
>> attributes of e1 cannot possibly be determined by the values of attributes
>> of e0, given the creation of e1 precedes the use of e0.
>>
>> Cheers,
>>
>> Paolo and Luc
>>
>>
>> [1]
>> http://dvcs.w3.org/hg/prov/raw-file/default/model/ProvenanceModel.html#relationship-between-derivation-and-process-execution
>>
>>
>> On 07/27/2011 09:11 AM, Khalid Belhajjame wrote:
>>
>>
>> Hi Satya,
>>
>> On 26/07/2011 19:26, Satya Sahoo wrote:
>>
>> Hi Khalid,
>> >  No information about the process pe is inferred. The above merely
>> > specifies that there exists a process execution, (which we don't know),
>> > such that isGeneratedBy(e1,pe,r1) and use(pe,e0,r0)
>> If we do not know about pe, then what new knowledge is being added to the
>> provenance store using the above rule?
>>
>>
>> I don't think that such a rule was suggested to infer new information. It
>> was merely used to clarify what the time t refers to in the assertion
>> isDerivedFrom(b1,b2,t), i.e., whether t refers to the time at which the
>> process execution that generates b2 uses b1, or the time at which the process
>> in question generates b2.
>>
>> Thanks, khalid
>>
>>  The information that a pe may exist anyway follows from our 'open world
>> assumption'.
>>
>>  > IMO, we cannot make this inference. The process execution pe may well
>>  > generate e1 without using e0, even if e0 is an input of that process
>>  > execution.
>> I agree with your point - there may be an indirect dependency between e1
>> and e0 (if pe cannot be executed without e0 being present). But, defining
>> the indirect dependency as the isGeneratedBy property may be inaccurate.
>>
>>  Thanks.
>>
>>  Best,
>> Satya
>>
>>
>> On Tue, Jul 26, 2011 at 4:26 AM, Khalid Belhajjame <
>> Khalid.Belhajjame@cs.man.ac.uk> wrote:
>>
>>>
>>> Hi Satya,
>>>
>>> On 26/07/2011 02:33, Satya Sahoo wrote:
>>>
>>> Hi Luc,
>>> >  I think there is a missing "inference" in the specification.
>>> > If isDerivedFrom(e1,e0) holds, then there exists a process
>>> > execution pe, and roles r0,r1, such that:
>>> > isGeneratedBy(e1,pe,r1) and use(pe,e0,r0)
>>>
>>>  I am not sure how we can infer additional information (pe, r0, r1) from
>>> limited information (e1, e0). Did you mean that we have the information about
>>> pe, r0, r1, and the link between them and (e1, e0) already stored somewhere?
>>>
>>>
>>>  No information about the process pe is inferred. The above merely
>>> specifies that there exists a process execution, (which we don't know), such
>>> that isGeneratedBy(e1,pe,r1) and use(pe,e0,r0)
>>>
>>>
>>>
>>>  As an alternate, I think we can define the inference rule in the
>>> opposite direction:
>>> if there exists: isGeneratedBy(e1,pe,r1) and use(pe,e0,r0)
>>> then: isDerivedFrom(e1,e0) holds true?
>>>
>>>
>>>  IMO, we cannot make this inference. The process execution pe may well
>>> generate e1 without using e0, even if e0 is an input of that process
>>> execution.
>>>
>>> Thanks, khalid
>>>
>>>
>>>
>>>  Also, if we consider the above alternate version of the rule, we need
>>> to define whether isDerivedFrom is "existentially dependent" on the
>>> "isGeneratedBy" and "use" properties; in other words, can we have
>>> isDerivedFrom(e1,e0) only if isGeneratedBy(e1,pe,r1) AND use(pe,e0,r0)
>>> already exist? Or can isDerivedFrom be independently asserted?
>>>
>>>  Best,
>>> Satya
>>>
>>> On Mon, Jul 25, 2011 at 4:21 AM, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
>>>
>>>>
>>>>
>>>> I'd like to refer to the missing inference I mentioned in a separate
>>>> thread:
>>>>
>>>> I think there is a missing "inference" in the specification.
>>>>
>>>> If isDerivedFrom(e1,e0) holds, then there exists a process
>>>> execution pe, and roles r0,r1,
>>>> such that:
>>>>  isGeneratedBy(e1,pe,r1) and use(pe,e0,r0)
>>>>
>>>>
>>>> So, given isDerivedFrom(e1,e0), I would argue that there are potentially
>>>> four notions of time associated with this derivation:
>>>> - beginning of pe
>>>> - end of pe
>>>> - use of e0
>>>> - generation of e1
>>>>
>>>> Paul, in your proposal, were you referring to any of these 4 instants, or
>>>> did you have another notion of time not captured yet?
>>>>
>>>>
>>>> Luc
>>>>
>>>>
>>>>
>>>> On 07/24/2011 09:12 PM, Paul Groth wrote:
>>>>
>>>>> Something like that...I need to look at the exact definition of derived
>>>>> from.
>>>>>
>>>>> Paul
>>>>>
>>>>> On Jul 24, 2011, at 20:43, Khalid Belhajjame<
>>>>> Khalid.Belhajjame@cs.man.ac.uk>  wrote:
>>>>>
>>>>>
>>>>>
>>>>>>
>>>>>> Ok, I must admit I didn't understand that. Just to clarify, when one
>>>>>> says isDerivedFrom(b1,b2,t), does that mean that b2 was created at t?
>>>>>>
>>>>>> Thanks, khalid
>>>>>>
>>>>>>
>>>>>> On 24/07/2011 18:33, Paul Groth wrote:
>>>>>>
>>>>>>
>>>>>>> Hi Khalid,
>>>>>>>
>>>>>>> I don't think this is what I mean.
>>>>>>>
>>>>>>> It's not when the assertion was made. It's when the derivation
>>>>>>> occurred according to the asserter.
>>>>>>>
>>>>>>> Just as with use and generation. It's the time at which these events
>>>>>>> occur according to the asserter.
>>>>>>>
>>>>>>> Thanks
>>>>>>> Paul
>>>>>>>
>>>>>>> On Jul 24, 2011, at 18:08, Khalid Belhajjame<
>>>>>>> Khalid.Belhajjame@cs.man.ac.uk>   wrote:
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> On 24/07/2011 15:35, Myers, Jim wrote:
>>>>>>>>
>>>>>>>>
>>>>>>>>> (The time is not the interval over which the derivation relation is
>>>>>>>>> valid - in the same way the time on USED is not the time when that
>>>>>>>>> relation is valid (it would be if the semantics were 'in use during
>>>>>>>>> interval t') - both just describe the time when an enduring
>>>>>>>>> relationship
>>>>>>>>> was first formed.)
>>>>>>>>>
>>>>>>>>>
>>>>>>>> Agreed, that is what I was hinting at in my last response email to
>>>>>>>> Paul. The time I was referring to in my email was the validity, but
>>>>>>>> Paul, I think, was talking about the time when the derivation was
>>>>>>>> formed.
>>>>>>>>
>>>>>>>> Which leads me to a new proposal. Instead of having the time as an
>>>>>>>> argument to USE, GENERATION and derivation, e.g.,
>>>>>>>> isDerivedFrom(b1,b2,t), would it be sensible to assume, instead, that
>>>>>>>> every assertion may be associated with a time at which it was formed?
>>>>>>>>
>>>>>>>> Thanks, Khalid
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>>  Jim
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> -----Original Message-----
>>>>>>>>>> From: public-prov-wg-request@w3.org
>>>>>>>>>> [mailto:public-prov-wg-request@w3.org] On Behalf Of Khalid Belhajjame
>>>>>>>>>> Sent: Sunday, July 24, 2011 8:27 AM
>>>>>>>>>> To: Paul Groth
>>>>>>>>>> Cc: Provenance Working Group WG; Provenance Working Group Issue Tracker
>>>>>>>>>> Subject: Re: PROV-ISSUE-43 (derivation-time): Deriviation should have
>>>>>>>>>> associated time [Conceptual Model]
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Hi Paul,
>>>>>>>>>>
>>>>>>>>>> On 24/07/2011 13:13, Paul Groth wrote:
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Hi Khalid
>>>>>>>>>>> But why can't I say that a newspaper article is derived from a
>>>>>>>>>>> picture at a particular time? Or for that matter over a period of
>>>>>>>>>>> time.
>>>>>>>>>>
>>>>>>>>>> The way I see it, is that there will be a bob representing the
>>>>>>>>>> newspaper article and another representing the picture. If there is
>>>>>>>>>> evidence that the latter is derived from the former, then the
>>>>>>>>>> derivation will always hold between those two bobs.
>>>>>>>>>>
>>>>>>>>>> Now, that I am writing this email, I am wondering whether we are
>>>>>>>>>> referring to the same notion of time. In your statement,
>>>>>>>>>> isDerivedFrom(b1,b2,t), I think you mean t is used to refers to the
>>>>>>>>>> time in which the derivation assertion was made, whereas what I was
>>>>>>>>>> thinking of is the (period of) time in which the derivation holds.
>>>>>>>>>> Is that the case?
>>>>>>>>>>
>>>>>>>>>> Thanks, khalid
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> The time is when the derivation occurred not when it applies.
>>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Paul
>>>>>>>>>>>
>>>>>>>>>>> On Jul 24, 2011, at 13:06, Khalid Belhajjame
>>>>>>>>>>> <Khalid.Belhajjame@cs.man.ac.uk> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Paul,
>>>>>>>>>>>>
>>>>>>>>>>>> I think that "Use" and "Generation" should be associated with time.
>>>>>>>>>>>> However, I don't think we should associate time to derivation.
>>>>>>>>>>>> I would argue that isDerivedFrom(b1,b2) holds all time. Although b1
>>>>>>>>>>>> and b2 may no longer exist, isDerivedFrom(b1,b2) is still valid.
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks, khalid
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 23/07/2011 16:46, Provenance Working Group Issue Tracker wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> PROV-ISSUE-43 (derivation-time): Deriviation should have associated
>>>>>>>>>>>>> time [Conceptual Model]
>>>>>>>>>>>>>
>>>>>>>>>>>>> http://www.w3.org/2011/prov/track/issues/43
>>>>>>>>>>>>>
>>>>>>>>>>>>> Raised by: Paul Groth
>>>>>>>>>>>>> On product: Conceptual Model
>>>>>>>>>>>>>
>>>>>>>>>>>>> Other relationships have time associated with them (e.g. use,
>>>>>>>>>>>>> generation, control)
>>>>>>>>>>>>>
>>>>>>>>>>>>> There is no optional time associated with derivation.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Suggested resolution is to add the following to the definition of
>>>>>>>>>>>>> isDerivedFrom:
>>>>>>>>>>>>>
>>>>>>>>>>>>> -  May contain a "derived from time" t, the time or time intervals
>>>>>>>>>>>>> when b1 was derived from b2
>>>>>>>>>>>>>
>>>>>>>>>>>>> Example:
>>>>>>>>>>>>> isDerivedFrom(b1,b2, t)
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>
>>>>>
>>>>>
>>>>
>>>>   --
>>>> Professor Luc Moreau
>>>> Electronics and Computer Science   tel:   +44 23 8059 4487
>>>> University of Southampton          fax:   +44 23 8059 2865
>>>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>>>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>>>>
>>>>
>>>>
>>>
>>>
>>
>>
>>
>>
>
Received on Friday, 5 August 2011 22:16:19 UTC