Re: PROV-ISSUE-50 (Ordering of Process): Defintion for Ordering of Process [Conceptual Model] from Satya Sahoo on 2012-01-11 (public-prov-wg@w3.org from January 2012)

From: Satya Sahoo <satya.sahoo@case.edu>
Date: Wed, 11 Jan 2012 13:03:12 -0500
To: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
Cc: public-prov-wg@w3.org
Message-ID: <CAOMwk6w13ShP++69ks48KyFDzU6CXzo7xv15t-0ZQc=ZG1UbNQ@mail.gmail.com>
Hi Luc,
Since the points raised in this issue have been superseded by updates to
DM, I am comfortable closing this issue.

Thanks.

Best,
Satya

On Wed, Nov 30, 2011 at 3:45 AM, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:

> Hi Satya,
> The discussion on this thread has not progressed since early October.
>
> The latest WD contains a new relation wasStartedBy between activities,
> which is
> simpler than wasScheduledAfter.
>
> For the second time, I am proposing to formally close this issue.
>
> Best regards,
> Luc
>
>
> On 10/03/2011 08:05 AM, Luc Moreau wrote:
>
> Hi Satya,
>
> Responses interleaved.
>
> On 03/10/11 01:54, Satya Sahoo wrote:
>
> Hi Luc,
> My comments are inline:
> >First, you will note that wasInformedBy is *not* a temporal relation
> between process executions.
>
>  The PROV-DM currently defines the following constraint for wasInformedBy:
> Given two process execution expressions denoted by pe1 and pe2, the
> expression wasInformedBy(pe2,pe1) holds, if and only if there is an
> entity expression denoted by e and qualifiers q1 and q2, such that
> wasGeneratedBy(e,pe1,q1) and used(pe2,e,q2) hold.
>
> If we consider the two expressions wasGeneratedBy(e, pe1, q1) and
> used(pe2, e, q2) - these two expressions together enforce that pe2 cannot
> have a start time that is "before" the start time of pe1. This is a temporal
> relation/ordering between pe1 and pe2. Hence, if both these expressions
> have to "hold" for wasInformedBy(pe2, pe1) to "hold", I am not sure how it
> is not a temporal ordering?
>
>
> I agree that some temporal constraints have to be satisfied for
> wasInformedBy(pe2, pe1), but it's a necessary condition,
> it's not a sufficient condition.  Information (represented as entity e
> above) is required to flow between process executions.
>
> Also, it's not a temporal order, but it's a temporal relation!  It is not
> transitive!
>
> For these reasons (information flow and non transitivity), I feel that
> wasInformedBy does not fall under
> your temporal ordering classification.
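
The definition quoted above, and the non-transitivity point, can be illustrated with a minimal Python sketch. The function name `infer_was_informed_by` and the tuple encoding of expressions are assumptions made for illustration and are not part of PROV-DM; qualifiers are omitted for brevity.

```python
def infer_was_informed_by(generated, used):
    """Return the set of (pe2, pe1) pairs for which wasInformedBy(pe2, pe1) holds.

    generated: set of (entity, pe) pairs, read as wasGeneratedBy(entity, pe)
    used:      set of (pe, entity) pairs, read as used(pe, entity)
    """
    return {
        (pe2, pe1)
        for (e_gen, pe1) in generated
        for (pe2, e_used) in used
        if e_gen == e_used  # the same entity must flow from pe1 to pe2
    }


# pe1 generates e1; pe2 uses e1 and generates e2; pe3 uses e2.
generated = {("e1", "pe1"), ("e2", "pe2")}
used = {("pe2", "e1"), ("pe3", "e2")}

informed = infer_was_informed_by(generated, used)
```

In this sketch `informed` contains (pe2, pe1) and (pe3, pe2) but not (pe3, pe1), because no single entity flows from pe1 to pe3. That is the sense in which the relation is not transitive.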
>
>
>
> >Second, it would be nice for PROV to have a temporal ordering relation.
> >However, we have to be careful. The relations
> >used/generatedBy/derivedFrom/dependedOn/... all have a notion of
> >causality/influence: the source of the edge being influenced by the edge
> >destination.
> >We know that causal order implies temporal order, but not the converse.
> >I am therefore reluctant to introduce a relation that arbitrarily captures
> >temporal order.  What would it give us? After all, we can associate time
> >with PEs, and given such time information, we can already decide if pe1
> >start precedes pe2 start, or if pe1 end precedes pe2 start. What would a
> >temporal relation give us over time?
> There are many non-causal properties that are part of provenance
> assertions.
>
> For example, to reconstruct the history of activities of an accused
> person X on Oct 2 before X reached the crime scene, the police make the
> following assertions:
> 1. X bought a car at 2:00pm US ET - buying the car is PE pe1
> 2. X bought flowers at 4:00pm US ET- buying flowers is PE pe2
> 3. X hailed a taxi and travelled to crime scene at 6:00pm US ET -
> travelling in taxi is PE pe3
>
>
> This is a nice example where wasScheduledAfter can be used!
>
>
> In the above scenario, the police need to have temporal ordering of PEs to
> establish that person X was in the city on the day of the crime but there
> is no causal relation between pe1, pe2, and pe3.
>
>
> There is some underpinning ordering, since there is X at 2pm, X at 4pm,
> and X at 6pm.
> This is exactly the definition of wasScheduledAfter.
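
The question of what a temporal relation adds over time information can be made concrete with a small Python sketch of the three PEs above. It assumes each PE carries a start time on the same day; the `start_times` mapping and the `started_before` helper are hypothetical names chosen for illustration.

```python
from datetime import time

# Start times of the three PEs from the example (same day, US ET).
start_times = {
    "pe1": time(14, 0),  # X bought a car
    "pe2": time(16, 0),  # X bought flowers
    "pe3": time(18, 0),  # X hailed a taxi and travelled to the crime scene
}


def started_before(a, b, times):
    """True if PE `a` started strictly before PE `b`."""
    return times[a] < times[b]


# Every ordered pair (earlier, later) that the timestamps alone justify.
ordering = sorted(
    (a, b)
    for a in start_times
    for b in start_times
    if started_before(a, b, start_times)
)
```

When timestamps are recorded, the ordering can be derived without any extra relation. An explicitly asserted ordering property would presumably matter mainly when such timestamps are unavailable.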
>
>
> As you stated, temporal ordering may or may not represent a causal relation
> between PEs, and since non-causal orderings of PEs occur in many provenance
> applications, we need to define a property for temporal ordering of PEs;
> causality-based temporal ordering is a specialization of that property.
>
>
> >The relation wasScheduledAfter attempts to capture some temporal ordering,
> >with underpinning causal influence.  You are incorrect to state that to
> >assert wasScheduledAfter you need to know of an agent. It's exactly the
> >contrary. By asserting wasScheduledAfter, you also assert the existence of
> >such an agent, but don't have to specify which it is.
>
> The PROV-DM currently defines the following constraint
> for wasScheduledAfter:
> Given two process execution expressions denoted by pe1 and pe2, the
> expression wasScheduledAfter(pe2,pe1) holds, if and only if there are two
> entity expressions denoted by e1 and e2, such that
> wasControlledBy(pe1,e1,qualifier(role="end")) and
> wasControlledBy(pe2,e2,qualifier(role="start")) and wasDerivedFrom(e2,e1).
> and
> This definition assumes that the activities represented by process
> execution expressions identified by pe1 and pe2 are controlled by some
> agents, represented by expressions identified by e1 and e2, where the
> first agent terminates (control qualifier qualifier(role="end")) the
> first activity, and the second initiates (control qualifier
> qualifier(role="start")) the second. The second agent being "derived"
> from the first enforces temporal ordering. If we don't know which are the
> Agents associated with pe1 and pe2 then how can we state that they are
> entities with identifiers e1 and e2?
>
>  In other words, if there are two PEs (from Taverna workflows) -
> retrieveGeneSequence and runBLASTService and John (the research robot)
> ended retrieveGeneSequence and Tom (the research robot - derived from John)
> started runBLASTService - then we can assert that runBLASTService
> wasScheduledAfter retrieveGeneSequence.
>
> But, if we don't know which Agents are associated with retrieveGeneSequence
> and runBLASTService PEs then how can we assert the wasScheduledAfter property
> between the two PEs?
>
>
> You will note that the constraint you copied contains "if and only if", so
> it is defining the expression wasScheduledAfter(pe2,pe1).
> It is therefore fine to assert it. The existential quantifier states the
> existence of agents, but when asserting wasScheduledAfter
> you don't need to know their identity. Vice-versa, if you know them and
> all other constraints are satisfied, then you can infer
> a wasScheduledAfter expression.
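
The two directions described here can be sketched in Python using the Taverna example. The function name `infer_was_scheduled_after` and the tuple encoding are assumptions made for illustration, not PROV-DM syntax.

```python
def infer_was_scheduled_after(controlled, derived):
    """Return (pe2, pe1) pairs for which wasScheduledAfter(pe2, pe1) can be inferred.

    controlled: set of (pe, agent, role) triples, read as
                wasControlledBy(pe, agent, qualifier(role=role))
    derived:    set of (e2, e1) pairs, read as wasDerivedFrom(e2, e1)
    """
    enders = {(pe, agent) for (pe, agent, role) in controlled if role == "end"}
    starters = {(pe, agent) for (pe, agent, role) in controlled if role == "start"}
    return {
        (pe2, pe1)
        for (pe1, e1) in enders
        for (pe2, e2) in starters
        if (e2, e1) in derived  # the starting agent is derived from the ending agent
    }


# Agents known: John ended retrieveGeneSequence, Tom started runBLASTService,
# and Tom was derived from John.
controlled = {
    ("retrieveGeneSequence", "John", "end"),
    ("runBLASTService", "Tom", "start"),
}
derived = {("Tom", "John")}
inferred = infer_was_scheduled_after(controlled, derived)

# The other direction: the relation is asserted directly, without naming any agent.
asserted = {("runBLASTService", "retrieveGeneSequence")}
```

Under this reading, `inferred` and `asserted` contain the same expression. The difference is only whether the agents were known when the statement was made.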
>
>
> There may be a third robot Albert and it is not related to either Tom or
> John by wasDerivedFrom property. But, a provenance application has to know
> which of the three robots (agents) are associated with the two PEs (and then
> verify that there is a wasDerivedFrom property linking the two robots).
>
>  The constraint defined for wasScheduledAfter is a rule and for the rule
> to "fire" its conditions have to evaluate to "true".
>
>  Just knowing that there exist some Agent associated
> with retrieveGeneSequence and runBLASTService PEs will not make the
> constraint evaluate to "true" - the provenance application has to specify
> which Agents (John and Tom) were associated with the two PEs.
>
>  Hence, according to the current PROV-DM text, my understanding is that a
> provenance application will need to know about the specific agents
> associated with PEs before they can use the wasScheduledAfter property.
> This information may or may not be available to a provenance application.
>
>  Therefore I am raising the need for a generic ordering property for PEs
> that can be simply asserted by provenance applications. Similar to other
> provenance assertions the ordering of PEs can be verified later using
> either timestamps or causal relations constraints.
>
>
> You have not answered my point. What does this give you that you can't
> infer from time information?
>
>
>  >Final point, your reference [1] had not been agreed, it is the proposal
> you made back then.
> Hence, I had raised this issue (Issue-50) to discuss the property. To
> clarify, have there been discussions or agreement on the two properties
> wasInformedBy and wasScheduledAfter (I may have missed the particular mails
> in the mailing list)?
>
>
> To my knowledge, this thread is the only one discussing these issues.  As
> Paul indicated a while back, the proposal
> is aligned with the rest of the document.
>
> I would like to see you put forward a proper definition of the concept you
> have in mind.  I would argue
> that your original text in [1] is not a definition but a requirement to be
> satisfied. Can you define this notion of temporal order in
> terms of the other "building blocks" of PROV (e.g. process start/end etc.)?
>
> Ultimately, we could introduce Allen's relations (
> http://en.wikipedia.org/wiki/Allen's_Interval_Algebra)
> but I am not sure it would be helpful in this context.
>
> Cheers,
> Luc
>
>
> Thanks.
>
>  Best,
> Satya
>
> On Sun, Oct 2, 2011 at 9:58 AM, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
>
>> Hi Satya,
>>
>> First, you will note that wasInformedBy is *not* a temporal relation
>> between process executions.
>> It is *not* transitive.  It requires information to flow between two
>> PEs.  For wasInformedBy(pe1,pe2),
>> a minimum constraint is that the end of pe2 does *not* precede the start
>> of pe1.
>> The data journalism example had an illustration of such relation. It has
>> been established to be useful
>> both theoretically and practically.
>>
>> Second, it would be nice for PROV to have a temporal ordering relation.
>> However, we have to be
>> careful. The relations used/generatedBy/derivedFrom/dependedOn/... all
>> have a notion of causality/influence:
>> the source of the edge being influenced by the edge destination.
>>
>> We know that causal order implies temporal order, but not the converse.
>> I am therefore reluctant
>> to introduce a relation that arbitrarily captures temporal order.  What
>> would it give us? After all,
>> we can associate time with
>> PEs, and given such time information, we can already decide if pe1 start
>> precedes pe2 start, or if pe1 end
>> precedes pe2 start. What would a temporal relation give us over time?
>>
>> The relation wasScheduledAfter attempts to capture some temporal ordering,
>> with underpinning
>> causal influence.  You are incorrect to state that to assert
>> wasScheduledAfter you need to know of an agent.
>> It's exactly the contrary. By asserting wasScheduledAfter, you also
>> assert the existence of such an
>> agent, but don't have to specify which it is.
>>
>> Final point, your reference [1] had not been agreed, it is the proposal
>> you made back then.
>>
>> So, in conclusion:
>> 1. I would argue that wasInformedBy is useful, and should be kept as
>> such, ... and definitely cannot
>>    be subsumed by some temporal ordering.
>>
>> 2. Temporal ordering *with* some form of underpinning causal influence,
>> is also useful. I agree that
>>    wasScheduledAfter is a first attempt. Maybe somebody can put forward
>> alternative definitions.
>>
>> Cheers,
>> Luc
>>
>>
>> On 02/10/11 02:03, Satya Sahoo wrote:
>>
>> Hi Luc,
>> I would like to re-raise this issue since the two properties defined in
>> PROV-DM, "wasInformedBy" and "wasScheduledAfter" do not represent the
>> original property for ordering process executions that was agreed to by the
>> provenance incubator group and also during the first F2F [1].
>>
>>  I believe there are primarily two dimensions/constraints for ordering
>> process executions:
>> a) Two PEs are scheduled (by agent/user) to execute in particular order
>> at specific time instants, which we can represent as *time-based
>> ordering of PEs*. Of course, additional information about which
>> agent/user started or stopped the PEs can be specified, but the time values
>> primarily define the ordering of the PEs.
>>
>>  b) A PE pe1 is designed to initiate/start a second PE pe2 (due to some
>> condition being satisfied for example a specific state was reached or some
>> entity became available), which we can represent as a *control-based
>> ordering of PEs*. This ordering of process cannot be effectively
>> captured by time-based ordering, since pe1 may still be executing while pe2
>> starts.
>>
>>  Both these cases are captured by the property "wasPrecededBy" (the
>> corresponding property in opposite direction can be "wasSucceededBy") where
>> the PEs were ordered according to their time of start/stop or explicit
>> start/stop by another PE.
>>
>>  Some specific comments on the current PROV-DM document Section 5.3.6
>> Ordering of Process Executions
>> =====
>> 1. An information flow ordering expression is a representation that a
>> characterized thing was generated by an activity, represented by a process
>> execution expression, before it was used by another activity, also
>> represented by a process execution expression.
>>
>>  Issue: This is a particular case of "time-based ordering"; there can be
>> multiple others. For example,
>>
>>  a) We can have the provenance assertions about two PEs Pe1 and Pe2: Pe1
>> was stopped at time instant t1 and Pe2 started at time instant t2 and t2 >
>> t1. Hence Pe2 wasPrecededBy Pe1
>>
>>  b) Similarly, we have provenance assertions about two PEs Pe1, Pe2 and
>> an Entity e1: Pe1 used e1 at time t1 and Pe2 used e1 at time t2 and t2 >
>> t1, hence (start of) Pe2 wasPrecededBy (start of) Pe1.
>>
>>  My suggestion is to just create a single generic property for ordering of
>> PEs (Khalid had suggested using PEs instead of Process) and allow specific
>> provenance applications to create more specialized PE ordering properties
>> according to their requirements.
>>
>>  2. According to the current definition of "wasScheduledAfter" we cannot
>> assert that one PE was scheduled after another PE if we don't have
>> information about the agent associated with the PEs. Further, the name of
>> the property seems to refer to the intended ordering of PEs rather than
>> actual execution of PEs - a workflow specification may have "scheduled" Pe1
>> to execute "after" Pe2, but during the workflow run, Pe2 may have executed
>> before Pe1?
>>
>>  Overall, I am not sure why we need two very special cases of PE
>> ordering property instead of using a generic "wasPrecededBy" (or
>> "wasSucceededBy") property that can be specialized as needed by different
>> provenance applications.
>>
>>  Thanks.
>>
>>  Best,
>> Satya
>>
>>  [1]
>> http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts#Ordering_of_process_execution
>>
>> On Fri, Sep 23, 2011 at 8:04 AM, Luc Moreau <l.moreau@ecs.soton.ac.uk> wrote:
>>
>>>
>>> Hi Satya,
>>>
>>> Issue has been closed pending review, with the latest document version.
>>> Feel free to reopen if not appropriate.
>>>
>>> Luc
>>>
>>>
>>> On 27/07/2011 02:51, Provenance Working Group Issue Tracker wrote:
>>>
>>>> PROV-ISSUE-50 (Ordering of Process): Defintion for Ordering of Process
>>>> [Conceptual Model]
>>>>
>>>> http://www.w3.org/2011/prov/track/issues/50
>>>>
>>>> Raised by: Satya Sahoo
>>>> On product: Conceptual Model
>>>>
>>>> I am not sure where we got the currently listed definition of
>>>> "Ordering of Process" - it is neither listed in the original provenance
>>>> concept page [1] nor in the consolidated concepts page [2].
>>>>
>>>> I had proposed the following definition:
>>>> "Ordering of processes execution (in provenance) needs to be modeled as
>>>> a property linking process entities in specific order along a particular
>>>> dimension (temporal or control flow)"
>>>>
>>>> [1]http://www.w3.org/2011/prov/wiki/ConceptOrderingOfProcesses
>>>> [2]
>>>> http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts#Ordering_of_process_execution
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>>>
>>
>
> --
> Professor Luc Moreau
> Electronics and Computer Science   tel:   +44 23 8059 4487
> University of Southampton          fax:   +44 23 8059 2865
> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>
>
Received on Wednesday, 11 January 2012 18:08:49 UTC