Re: PROV-ISSUE-50 (Ordering of Process): Defintion for Ordering of Process [Conceptual Model]

From: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
Date: Wed, 11 Jan 2012 21:48:53 +0000
Message-ID: <EMEW3|de03562ad7348c13a746b8f6b9e128bfo0ALn008L.Moreau|ecs.soton.ac.uk|4F0E03C5.3020807@ecs.soton.ac.uk>
To: Satya Sahoo <satya.sahoo@case.edu>
CC: public-prov-wg@w3.org
Thanks Satya, it's now closed.

On 11/01/12 18:03, Satya Sahoo wrote:
> Hi Luc,
> Since the points raised in this issue have been superseded by 
> updates to DM, I am comfortable in closing this issue.
>
> Thanks.
>
> Best,
> Satya
>
> On Wed, Nov 30, 2011 at 3:45 AM, Luc Moreau <L.Moreau@ecs.soton.ac.uk 
> <mailto:L.Moreau@ecs.soton.ac.uk>> wrote:
>
>     Hi Satya,
>     The discussion on this thread has not progressed since early
>     October.
>
>     The latest WD contains a new relation wasStartedBy between
>     activities, which is
>     simpler than wasScheduledAfter.
>
>     For the second time, I am proposing to formally close this issue.
>
>     Best regards,
>     Luc
>
>
>     On 10/03/2011 08:05 AM, Luc Moreau wrote:
>>     Hi Satya,
>>
>>     Responses interleaved.
>>
>>     On 03/10/11 01:54, Satya Sahoo wrote:
>>>     Hi Luc,
>>>     My comments are inline:
>>>     >First, you will note that wasInformedBy is *not* a temporal
>>>     relation between process executions.
>>>
>>>     The PROV-DM currently defines the following constraint for
>>>     wasInformedBy:
>>>     Given two process execution expressions denoted by pe1 and pe2,
>>>     the expression wasInformedBy(pe2,pe1) holds, if and only if
>>>     there is an entity expression denoted by e and qualifiers q1 and
>>>     q2, such that wasGeneratedBy(e,pe1,q1) and used(pe2,e,q2) hold.
>>>
>>>     If we consider the two expressions wasGeneratedBy(e, pe1, q1)
>>>     and used(pe2, e, q2) - these two expressions together enforce
>>>     that pe2 cannot have start time that is "before" start time of
>>>     pe1. This is temporal relation/ordering between pe1 and pe2.
>>>     Hence, if both these expressions have to "hold" for
>>>     wasInformedBy(pe2, pe1) to "hold" I am not sure how it is not a
>>>     temporal ordering?
>>
>>     I agree that some temporal constraints have to be satisfied for
>>     wasInformedBy(pe2, pe1), but it's a necessary condition,
>>     it's not a sufficient condition.  Information (represented as
>>     entity e above) is required to flow between process executions.
>>
>>     Also, it's not a temporal order, but it's a temporal relation! 
>>     It is not transitive!
>>
>>     For these reasons (information flow and non-transitivity), I feel
>>     that wasInformedBy does not fall under
>>     your temporal ordering classification.
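
The constraint quoted above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: facts are plain tuples rather than PROV-DM expressions, qualifiers are ignored, and the function name and sample identifiers are hypothetical. The sample data also shows the non-transitivity mentioned above.

```python
# Illustrative only: facts are plain tuples, not PROV-DM syntax.
# generated holds (entity, process_execution) pairs;
# used holds (process_execution, entity) pairs.

def was_informed_by(pe2, pe1, generated, used):
    """True if some entity was generated by pe1 and used by pe2."""
    produced_by_pe1 = {e for (e, pe) in generated if pe == pe1}
    consumed_by_pe2 = {e for (pe, e) in used if pe == pe2}
    return bool(produced_by_pe1 & consumed_by_pe2)

# Hypothetical data: e1 flows from pe1 to pe2, and e2 flows from pe2 to pe3.
generated = {("e1", "pe1"), ("e2", "pe2")}
used = {("pe2", "e1"), ("pe3", "e2")}

print(was_informed_by("pe2", "pe1", generated, used))  # True
print(was_informed_by("pe3", "pe2", generated, used))  # True
# No entity flows directly from pe1 to pe3, so the relation does not chain.
print(was_informed_by("pe3", "pe1", generated, used))  # False
```

A purely temporal ordering would place pe1 before pe3 here, whereas this relation does not hold between them because no information flows directly from one to the other.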
>>
>>>
>>>
>>>     >Second, it would be nice for PROV to have a temporal ordering
>>>     relation. However, we have to be
>>>     >careful. The relations
>>>     used/generatedBy/derivedFrom/dependedOn/... all have a notion of
>>>     >causality/influence: the source of the edge being influenced by
>>>     the edge destination.
>>>     >We know that causal order implies temporal order, but not the
>>>     converse.  I am therefore reluctant
>>>     >to introduce a relation that arbitrarily captures temporal
>>>     order.  What would it give us? After all,
>>>     >we can associate time with PEs, and given such time
>>>     information, we can already decide if pe1 start precedes pe2
>>>     start, or if pe1 end precedes pe2 start. What would a temporal
>>>     relation give us over time?
>>>     There are many non-causal properties that are part of provenance
>>>     assertions.
>>>
>>>     For example, to reconstruct the history of activities of an
>>>     accused person X on Oct 2 before X reached the crime scene,
>>>     the police make the following assertions:
>>>     1. X bought a car at 2:00pm US ET - buying the car is PE pe1
>>>     2. X bought flowers at 4:00pm US ET- buying flowers is PE pe2
>>>     3. X hailed a taxi and travelled to crime scene at 6:00pm US ET
>>>     - travelling in taxi is PE pe3
>>
>>     This is a nice example where wasScheduledAfter can be used!
>>
>>>
>>>     In the above scenario, the police need to have temporal ordering
>>>     of PEs to establish that person X was in the city on the day of
>>>     the crime but there is no causal relation between pe1, pe2, and pe3.
>>
>>     There is some underpinning ordering, since there is X at 2pm, X
>>     at 4pm, and X at 6pm.
>>     This is exactly the definition of wasScheduledAfter.
>>
>>>
>>>     As you stated, temporal ordering may or may not represent causal
>>>     relation between PEs and since non-causal ordering of PEs occur
>>>     in many provenance applications we need to define a property for
>>>     temporal ordering of PEs and causality-based temporal ordering
>>>     is a specialization of that property.
>>>
>>>
>>>     >The relation wasScheduledAfter attempts to capture some temporal
>>>     ordering, with underpinning
>>>     >causal influence.  You are incorrect to state that to assert
>>>     wasScheduledAfter you need to know of an agent. It's exactly
>>>     the contrary. By asserting wasScheduledAfter, you also assert
>>>     the existence of such an agent, but don't have to specify which
>>>     it is.
>>>
>>>     The PROV-DM currently defines the following constraint
>>>     for wasScheduledAfter:
>>>     Given two process execution expressions denoted by pe1 and pe2,
>>>     the expression wasScheduledAfter(pe2,pe1) holds, if and only if
>>>     there are two entity expressions denoted by e1 and e2, such that
>>>     wasControlledBy(pe1,e1,qualifier(role="end")) and
>>>     wasControlledBy(pe2,e2,qualifier(role="start")) and
>>>     wasDerivedFrom(e2,e1).
>>>     and
>>>     This definition assumes that the activities represented by
>>>     process execution expressions identified by pe1 and pe2 are
>>>     controlled by some agents, represented by expressions identified
>>>     by e1 and e2, where the first agent terminates (control
>>>     qualifier qualifier(role="end")) the first activity, and the
>>>     second initiates (control qualifier qualifier(role="start")) the
>>>     second. The second agent being "derived" from the first enforces
>>>     temporal ordering. If we don't know which are the Agents
>>>     associated with pe1 and pe2 then how can we state that they are
>>>     entities with identifiers e1 and e2?
>>>
>>>     In other words, if there are two PEs (from Taverna workflows) -
>>>     retrieveGeneSequence and runBLASTService and John (the research
>>>     robot) ended retrieveGeneSequence and Tom (the research robot -
>>>     derived from John) started runBLASTService - then we can assert
>>>     that runBLASTService wasScheduledAfter retrieveGeneSequence.
>>>
>>>     But if we don't know which Agents are associated with
>>>     retrieveGeneSequence and runBLASTService PEs then how can we
>>>     assert wasScheduledAfter property between the two PEs?
>>
>>     You will note that the constraint you copied contains "if and
>>     only if", so it is defining the expression
>>     wasScheduledAfter(pe2,pe1).
>>     It is therefore fine to assert it. The existential quantifier
>>     states the existence of agents, but when asserting wasScheduledAfter
>>     you don't need to know their identity. Vice versa, if you know
>>     them and all other constraints are satisfied, then you can infer
>>     a wasScheduledAfter expression.
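
The two directions of the "if and only if" can be illustrated with a minimal Python sketch. It assumes facts are stored as plain tuples (not PROV-DM syntax); the function name and tuple layout are hypothetical, and the robot names are taken from the Taverna example in this thread.

```python
# Illustrative only: facts are plain tuples, not PROV-DM syntax.
# controlled holds (process_execution, agent, role) triples;
# derived holds (e2, e1) pairs meaning "e2 wasDerivedFrom e1".

def infer_was_scheduled_after(pe2, pe1, controlled, derived):
    """True if an agent that ended pe1 has a derived agent that started pe2."""
    enders = {a for (pe, a, role) in controlled if pe == pe1 and role == "end"}
    starters = {a for (pe, a, role) in controlled if pe == pe2 and role == "start"}
    return any((e2, e1) in derived for e1 in enders for e2 in starters)

# Hypothetical facts for the two PEs discussed in the thread.
controlled = {
    ("retrieveGeneSequence", "John", "end"),
    ("runBLASTService", "Tom", "start"),
}
derived = {("Tom", "John")}  # Tom wasDerivedFrom John

# Inference direction: the agents are known, so the expression can be derived.
print(infer_was_scheduled_after(
    "runBLASTService", "retrieveGeneSequence", controlled, derived))  # True

# Assertion direction: the expression is stated directly and no agent is named.
# The definition then only implies that suitable agents exist.
asserted = {("wasScheduledAfter", "runBLASTService", "retrieveGeneSequence")}
```

If Albert, who is unrelated by derivation, had started runBLASTService instead of Tom, the inference function would return False, while a directly asserted expression would still be admissible.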
>>
>>>
>>>     There may be a third robot Albert and it is not related to either
>>>     Tom or John by wasDerivedFrom property. But, a provenance
>>>     application has to know which of the three robots (agents) are
>>>     associated with the two PEs (and then verify that there is a
>>>     wasDerivedFrom property linking the two robots).
>>>
>>>     The constraint defined for wasScheduledAfter is a rule and for
>>>     the rule to "fire" its conditions have to evaluate to "true".
>>>
>>>     Just knowing that there exists some Agent associated
>>>     with retrieveGeneSequence and runBLASTService PEs will not make
>>>     the constraint evaluate to "true" - the provenance application
>>>     has to specify which Agents (John and Tom) were associated with
>>>     the two PEs.
>>>
>>>     Hence, according to the current PROV-DM text, my understanding
>>>     is that a provenance application will need to know about the
>>>     specific agents associated with PEs before they can use the
>>>     wasScheduledAfter property. This information may or may not be
>>>     available to a provenance application.
>>>
>>>     Therefore I am raising the need for a generic ordering property
>>>     for PEs that can be simply asserted by provenance applications.
>>>     Similar to other provenance assertions the ordering of PEs can
>>>     be verified later using either timestamps or causal relations
>>>     constraints.
>>
>>     You have not answered my point. What does this give you that you
>>     can't infer from time information?
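
As a rough illustration of ordering derived purely from time information, the three PEs of the police example can be sorted by their start times. The dictionary layout and function name are hypothetical, and the year is an assumption since the example only says "Oct 2".

```python
from datetime import datetime

# Hypothetical start times for the three PEs; the year 2011 is assumed.
starts = {
    "pe1": datetime(2011, 10, 2, 14, 0),  # X bought a car
    "pe2": datetime(2011, 10, 2, 16, 0),  # X bought flowers
    "pe3": datetime(2011, 10, 2, 18, 0),  # X travelled by taxi
}

def start_precedes(a, b, starts):
    """True if PE a started strictly before PE b."""
    return starts[a] < starts[b]

# The order is recovered from timestamps alone, with no ordering relation asserted.
ordered = sorted(starts, key=starts.get)
print(ordered)                               # ['pe1', 'pe2', 'pe3']
print(start_precedes("pe1", "pe3", starts))  # True
```

This only works when timestamps are recorded for every PE; it says nothing about causal influence between them.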
>>>
>>>     >Final point, your reference [1] had not been agreed, it is the
>>>     proposal you made back then.
>>>     Hence, I had raised this issue (Issue-50) to discuss the
>>>     property. To clarify, have there been discussions or agreement on
>>>     the two properties wasInformedBy and wasScheduledAfter (I may
>>>     have missed the particular mails in the mailing list)?
>>
>>     To my knowledge, this thread is the only one discussing these
>>     issues.  As Paul indicated a while back, the proposal
>>     is aligned with the rest of the document.
>>
>>     I would like to see you putting a proper definition of the
>>     concept you would have in mind.  I would argue
>>     that your original text in [1] is not a definition but a
>>     requirement to be satisfied. Can you define this notion of
>>     temporal order in
>>     terms of the other "building blocks" of PROV (e.g. process
>>     start/end etc.)?
>>
>>     Ultimately, we could introduce Allen's relations
>>     (http://en.wikipedia.org/wiki/Allen's_Interval_Algebra
>>     <http://en.wikipedia.org/wiki/Allen%27s_Interval_Algebra>)
>>     but I am not sure it would be helpful in this context.
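
For readers unfamiliar with Allen's relations, a minimal Python sketch of the thirteen interval relations follows. It assumes each process execution is reduced to a (start, end) pair with start < end; the function name and the numeric sample intervals are hypothetical and are not part of PROV-DM.

```python
def allen_relation(a, b):
    """Return the Allen relation of interval a to interval b.

    Each interval is a (start, end) pair with start < end.
    """
    (s1, e1), (s2, e2) = a, b
    if e1 < s2:
        return "before"
    if e2 < s1:
        return "after"
    if e1 == s2:
        return "meets"
    if e2 == s1:
        return "met-by"
    if s1 == s2 and e1 == e2:
        return "equals"
    if s1 == s2:
        return "starts" if e1 < e2 else "started-by"
    if e1 == e2:
        return "finishes" if s1 > s2 else "finished-by"
    if s2 < s1 and e1 < e2:
        return "during"
    if s1 < s2 and e2 < e1:
        return "contains"
    # Remaining cases are proper overlaps in one direction or the other.
    return "overlaps" if s1 < s2 else "overlapped-by"

print(allen_relation((0, 2), (3, 5)))  # before
print(allen_relation((0, 3), (2, 5)))  # overlaps
print(allen_relation((2, 3), (0, 5)))  # during
```

The "overlaps" case corresponds to the situation raised earlier in the thread where pe1 is still executing while pe2 starts.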
>>
>>     Cheers,
>>     Luc
>>
>>>
>>>     Thanks.
>>>
>>>     Best,
>>>     Satya
>>>
>>>     On Sun, Oct 2, 2011 at 9:58 AM, Luc Moreau
>>>     <L.Moreau@ecs.soton.ac.uk <mailto:L.Moreau@ecs.soton.ac.uk>> wrote:
>>>
>>>         Hi Satya,
>>>
>>>         First, you will note that wasInformedBy is *not* a temporal
>>>         relation between process executions.
>>>         It is *not* transitive.  It requires information to flow
>>>         between two PEs.  For wasInformedBy(pe1,pe2),
>>>         a minimum constraint is that the end of pe2 does *not*
>>>         precede the start of pe1.
>>>         The data journalism example had an illustration of such
>>>         relation. It has been established to be useful
>>>         both theoretically and practically.
>>>
>>>         Second, it would be nice for PROV to have a temporal
>>>         ordering relation. However, we have to be
>>>         careful. The relations
>>>         used/generatedBy/derivedFrom/dependedOn/... all have a
>>>         notion of causality/influence:
>>>         the source of the edge being influenced by the edge destination.
>>>
>>>         We know that causal order implies temporal order, but not
>>>         the converse.  I am therefore reluctant
>>>         to introduce a relation that arbitrarily captures temporal
>>>         order.  What would it give us? After all,
>>>         we can associate time with
>>>         PEs, and given such time information, we can already decide
>>>         if pe1 start precedes pe2 start, or if pe1 end
>>>         precedes pe2 start. What would a temporal relation give us
>>>         over time?
>>>
>>>         The relation wasScheduledAfter attempts to capture some
>>>         temporal ordering, with underpinning
>>>         causal influence.  You are incorrect to state that to assert
>>>         wasScheduledAfter you need to know of an agent.
>>>         It's exactly the contrary. By asserting wasScheduledAfter,
>>>         you also assert the existence of such an
>>>         agent, but don't have to specify which it is.
>>>
>>>         Final point, your reference [1] had not been agreed, it is
>>>         the proposal you made back then.
>>>
>>>         So, in conclusion:
>>>         1. I would argue that wasInformedBy is useful, and should be
>>>         kept as such, ... and definitely cannot
>>>            be subsumed by some temporal ordering.
>>>
>>>         2. Temporal ordering *with* some form of underpinning causal
>>>         influence, is also useful. I agree that
>>>            wasScheduledAfter is a first attempt. Maybe somebody can
>>>         put forward alternative definitions.
>>>
>>>         Cheers,
>>>         Luc
>>>
>>>
>>>         On 02/10/11 02:03, Satya Sahoo wrote:
>>>>         Hi Luc,
>>>>         I would like to re-raise this issue since the two
>>>>         properties defined in PROV-DM, "wasInformedBy" and
>>>>         "wasScheduledAfter" do not represent the original property
>>>>         for ordering process executions that was agreed to by the
>>>>         provenance incubator group and also during the first F2F [1].
>>>>
>>>>         I believe there are primarily two dimensions/constraints
>>>>         for ordering process executions:
>>>>         a) Two PEs are scheduled (by agent/user) to execute in
>>>>         particular order at specific time instants, which we can
>>>>         represent as *time-based ordering of PEs*. Of course,
>>>>         additional information about which agent/user started or
>>>>         stopped the PEs can be specified, but the time values
>>>>         primarily define the ordering of the PEs.
>>>>
>>>>         b) A PE pe1 is designed to initiate/start a second PE pe2
>>>>         (due to some condition being satisfied for example a
>>>>         specific state was reached or some entity became
>>>>         available), which we can represent as a *control-based
>>>>         ordering of PEs*. This ordering of process cannot be
>>>>         effectively captured by time-based ordering, since pe1 may
>>>>         still be executing while pe2 starts.
>>>>
>>>>         Both these cases are captured by the property
>>>>         "wasPrecededBy" (the corresponding property in opposite
>>>>         direction can be "wasSucceededBy") where the PEs were
>>>>         ordered according to their time of start/stop or explicit
>>>>         start/stop by another PE.
>>>>
>>>>         Some specific comments on the current PROV-DM document
>>>>         Section 5.3.6 Ordering of Process Executions
>>>>         =====
>>>>         1. An information flow ordering expression is a
>>>>         representation that a characterized thing was generated by
>>>>         an activity, represented by a process execution expression,
>>>>         before it was used by another activity, also represented by
>>>>         a process execution expression.
>>>>
>>>>         Issue: This is a particular case of "time-based ordering",
>>>>         there can be multiple others. For example,
>>>>
>>>>         a) We can have the provenance assertions about two PEs Pe1
>>>>         and Pe2: Pe1 was stopped at time instant t1 and Pe2 started
>>>>         at time instant t2 and t2 > t1. Hence Pe2 wasPrecededBy Pe1
>>>>
>>>>         b) Similarly, we have provenance assertions about two PEs
>>>>         Pe1, Pe2 and an Entity e1: Pe1 used e1 at time t1 and PE2
>>>>         used e1 at time t2 and t2 > t1, hence (start of) Pe2
>>>>         wasPrecededBy (start of) Pe1.
>>>>
>>>>         My suggestion is to just create a single generic property for
>>>>         ordering of PEs (Khalid had suggested using PEs instead of
>>>>         Process) and allow specific provenance applications to
>>>>         create more specialized PE ordering properties according to
>>>>         their requirements.
>>>>
>>>>         2. According to the current definition of
>>>>         "wasScheduledAfter" we cannot assert that one PE was
>>>>         scheduled after another PE if we don't have information
>>>>         about the agent associated with the PEs. Further, the name
>>>>         of the property seems to refer to the intended ordering of
>>>>         PEs rather than actual execution of PEs - a workflow
>>>>         specification may have "scheduled" Pe1 to execute "after"
>>>>         Pe2, but during the workflow run, Pe2 may have executed
>>>>         before Pe1?
>>>>
>>>>         Overall, I am not sure why we need two very special cases
>>>>         of PE ordering property instead of using a generic
>>>>         "wasPrecededBy" (or "wasSucceededBy") property that can be
>>>>         specialized as needed by different provenance applications.
>>>>
>>>>         Thanks.
>>>>
>>>>         Best,
>>>>         Satya
>>>>
>>>>         [1]
>>>>         http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts#Ordering_of_process_execution
>>>>
>>>>         On Fri, Sep 23, 2011 at 8:04 AM, Luc Moreau
>>>>         <l.moreau@ecs.soton.ac.uk
>>>>         <mailto:l.moreau@ecs.soton.ac.uk>> wrote:
>>>>
>>>>
>>>>             Hi Satya,
>>>>
>>>>             Issue has been closed pending review, with the latest
>>>>             document version.
>>>>             Feel free to reopen if not appropriate.
>>>>
>>>>             Luc
>>>>
>>>>
>>>>             On 27/07/2011 02:51, Provenance Working Group Issue
>>>>             Tracker wrote:
>>>>
>>>>                 PROV-ISSUE-50 (Ordering of Process): Defintion for
>>>>                 Ordering of Process [Conceptual Model]
>>>>
>>>>                 http://www.w3.org/2011/prov/track/issues/50
>>>>
>>>>                 Raised by: Satya Sahoo
>>>>                 On product: Conceptual Model
>>>>
>>>>                 I am not sure where we got the currently listed
>>>>                 definition of "Ordering of Process" - it is neither
>>>>                 listed in the original provenance concept page [1]
>>>>                 nor in the consolidated concepts page [2].
>>>>
>>>>                 I had proposed the following definition:
>>>>                 "Ordering of processes execution (in provenance)
>>>>                 needs to be modeled as a property linking process
>>>>                 entities in specific order along a particular
>>>>                 dimension (temporal or control flow)"
>>>>
>>>>                 [1]http://www.w3.org/2011/prov/wiki/ConceptOrderingOfProcesses
>>>>                 [2]
>>>>                 http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts#Ordering_of_process_execution
>>>>
>>>>
>>>>
>>>>
>>>>
>>>>
>>>
>
>     -- 
>     Professor Luc Moreau
>     Electronics and Computer Science   tel: +44 23 8059 4487
>     University of Southampton          fax: +44 23 8059 2865
>     Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>     United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>          
>
>
Received on Wednesday, 11 January 2012 21:52:01 UTC
