- From: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
- Date: Wed, 11 Jan 2012 21:48:53 +0000
- To: Satya Sahoo <satya.sahoo@case.edu>
- CC: public-prov-wg@w3.org
- Message-ID: <EMEW3|de03562ad7348c13a746b8f6b9e128bfo0ALn008L.Moreau|ecs.soton.ac.uk|4F0E03C5>
Thanks Satya, it's now closed.

On 11/01/12 18:03, Satya Sahoo wrote:
> Hi Luc,
> Since the points raised in this issue have been superseded by
> updates to DM, I am comfortable in closing this issue.
>
> Thanks.
>
> Best,
> Satya
>
> On Wed, Nov 30, 2011 at 3:45 AM, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
>
> Hi Satya,
> The discussion on this thread has not progressed since early
> October.
>
> The latest WD contains a new relation wasStartedBy between
> activities, which is simpler than wasScheduledAfter.
>
> For the second time, I am proposing to formally close this issue.
>
> Best regards,
> Luc
>
>
> On 10/03/2011 08:05 AM, Luc Moreau wrote:
>> Hi Satya,
>>
>> Responses interleaved.
>>
>> On 03/10/11 01:54, Satya Sahoo wrote:
>>> Hi Luc,
>>> My comments are inline:
>>> >First, you will note that wasInformedBy is *not* a temporal
>>> >relation between process executions.
>>>
>>> The PROV-DM currently defines the following constraint for
>>> wasInformedBy:
>>> Given two process execution expressions denoted by pe1 and pe2,
>>> the expression wasInformedBy(pe2,pe1) holds, if and only if
>>> there is an entity expression denoted by e and qualifiers q1 and
>>> q2, such that wasGeneratedBy(e,pe1,q1) and used(pe2,e,q2) hold.
>>>
>>> If we consider the two expressions wasGeneratedBy(e, pe1, q1)
>>> and used(pe2, e, q2) - these two expressions together enforce
>>> that pe2 cannot have a start time that is "before" the start time
>>> of pe1. This is a temporal relation/ordering between pe1 and pe2.
>>> Hence, if both these expressions have to "hold" for
>>> wasInformedBy(pe2, pe1) to "hold" I am not sure how it is not a
>>> temporal ordering?
>>
>> I agree that some temporal constraints have to be satisfied for
>> wasInformedBy(pe2, pe1), but it's a necessary condition,
>> it's not a sufficient condition. Information (represented as
>> entity e above) is required to flow between process executions.
>>
>> Also, it's not a temporal order, but it's a temporal relation!
>> It is not transitive!
>>
>> For these reasons (information flow and non transitivity), I feel
>> that wasInformedBy does not fall under
>> your temporal ordering classification.
>>
>>>
>>> >Second, it would be nice for PROV to have a temporal ordering
>>> >relation. However, we have to be careful. The relations
>>> >used/generatedBy/derivedFrom/dependedOn/... all have a notion of
>>> >causality/influence: the source of the edge being influenced by
>>> >the edge destination.
>>> >We know that causal order implies temporal order, but not the
>>> >converse. I am therefore reluctant to introduce a relation that
>>> >arbitrarily captures temporal order. What would it give us?
>>> >After all, we can associate time with PEs, and given such time
>>> >information, we can already decide if pe1 start precedes pe2
>>> >start, or if pe1 end precedes pe2 start. What would a temporal
>>> >relation give us over time?
>>>
>>> There are many non-causal properties that are part of provenance
>>> assertions.
>>>
>>> For example, to reconstruct the history of activities of an
>>> accused person X on Oct 2 before X reached the crime scene,
>>> the police make the following assertions:
>>> 1. X bought a car at 2:00pm US ET - buying the car is PE pe1
>>> 2. X bought flowers at 4:00pm US ET - buying flowers is PE pe2
>>> 3. X hailed a taxi and travelled to crime scene at 6:00pm US ET
>>> - travelling in taxi is PE pe3
>>
>> This is a nice example where wasScheduledAfter can be used!
>>
>>>
>>> In the above scenario, the police need to have temporal ordering
>>> of PEs to establish that person X was in the city on the day of
>>> the crime but there is no causal relation between pe1, pe2, and pe3.
>>
>> There is some underpinning ordering, since there is X at 2pm, X
>> at 4pm, and X at 6pm.
>> This is exactly the definition of wasScheduledAfter.
>>
>>>
>>> As you stated, temporal ordering may or may not represent causal
>>> relation between PEs and since non-causal ordering of PEs occur
>>> in many provenance applications we need to define a property for
>>> temporal ordering of PEs and causality-based temporal ordering
>>> is a specialization of that property.
>>>
>>>
>>> >The relation wasScheduledAfter attempts to capture some temporal
>>> >ordering, with underpinning causal influence. You are incorrect
>>> >to state that to assert wasScheduledAfter you need to know of an
>>> >agent. It's exactly the contrary. By asserting
>>> >wasScheduledAfter, you also assert the existence of such an
>>> >agent, but don't have to specify which it is.
>>>
>>> The PROV-DM currently defines the following constraint
>>> for wasScheduledAfter:
>>> Given two process execution expressions denoted by pe1 and pe2,
>>> the expression wasScheduledAfter(pe2,pe1) holds, if and only if
>>> there are two entity expressions denoted by e1 and e2, such that
>>> wasControlledBy(pe1,e1,qualifier(role="end")) and
>>> wasControlledBy(pe2,e2,qualifier(role="start")) and
>>> wasDerivedFrom(e2,e1).
>>> and
>>> This definition assumes that the activities represented by
>>> process execution expressions identified by pe1 and pe2 are
>>> controlled by some agents, represented by expressions identified
>>> by e1 and e2, where the first agent terminates (control
>>> qualifier qualifier(role="end")) the first activity, and the
>>> second initiates (control qualifier qualifier(role="start")) the
>>> second. The second agent being "derived" from the first enforces
>>> temporal ordering. If we don't know which are the Agents
>>> associated with pe1 and pe2 then how can we state that they are
>>> entities with identifiers e1 and e2?
>>>
>>> In other words, if there are two PEs (from Taverna workflows) -
>>> retrieveGeneSequence and runBLASTService and John (the research
>>> robot) ended retrieveGeneSequence and Tom (the research robot -
>>> derived from John) started runBLASTService - then we can assert
>>> that runBLASTService wasScheduledAfter retrieveGeneSequence.
>>>
>>> But if we don't know which Agents are associated with
>>> retrieveGeneSequence and runBLASTService PEs then how can we
>>> assert wasScheduledAfter property between the two PEs?
>>
>> You will note that the constraint you copied contains "if and
>> only if", so it is defining the expression
>> wasScheduledAfter(pe2,pe1).
>> It is therefore fine to assert it. The existential quantifier
>> states the existence of agents, but when asserting wasScheduledAfter
>> you don't need to know their identity. Vice-versa, if you know
>> them and all other constraints are satisfied, then you can infer
>> a wasScheduledAfter expression.
>>
>>>
>>> There may be a third robot Albert and it is not related to either
>>> Tom or John by wasDerivedFrom property. But, a provenance
>>> application has to know which of three robots (agents) are
>>> associated with the two PEs (and then verify that there is a
>>> wasDerivedFrom property linking the two robots).
>>>
>>> The constraint defined for wasScheduledAfter is a rule and for
>>> the rule to "fire" its conditions have to evaluate to "true".
>>>
>>> Just knowing that there exist some Agent associated
>>> with retrieveGeneSequence and runBLASTService PEs will not make
>>> the constraint evaluate to "true" - the provenance application
>>> has to specify which Agents (John and Tom) were associated with
>>> the two PEs.
>>>
>>> Hence, according to the current PROV-DM text, my understanding
>>> is that a provenance application will need to know about the
>>> specific agents associated with PEs before they can use the
>>> wasScheduledAfter property.
>>> This information may or may not be
>>> available to a provenance application.
>>>
>>> Therefore I am raising the need for a generic ordering property
>>> for PEs that can be simply asserted by provenance applications.
>>> Similar to other provenance assertions the ordering of PEs can
>>> be verified later using either timestamps or causal relations
>>> constraints.
>>
>> You have not answered my point. What does this give you that you
>> can't infer from time information?
>>>
>>> >Final point, your reference [1] had not been agreed, it is the
>>> >proposal you made back then.
>>> Hence, I had raised this issue (Issue-50) to discuss the
>>> property. To clarify, have there been discussions or agreement on
>>> the two properties wasInformedBy and wasScheduledAfter (I may
>>> have missed the particular mails in the mailing list)?
>>
>> To my knowledge, this thread is the only one discussing these
>> issues. As Paul indicated a while back, the proposal
>> is aligned with the rest of the document.
>>
>> I would like to see you putting a proper definition of the
>> concept you would have in mind. I would argue
>> that your original text in [1] is not a definition but a
>> requirement to be satisfied. Can you define this notion of
>> temporal order in
>> terms of the other "building blocks" of PROV (e.g. process
>> start/end etc).
>>
>> Ultimately, we could introduce Allen's relations
>> (http://en.wikipedia.org/wiki/Allen's_Interval_Algebra)
>> but I am not sure it would be helpful in this context.
>>
>> Cheers,
>> Luc
>>
>>>
>>> Thanks.
>>>
>>> Best,
>>> Satya
>>>
>>> On Sun, Oct 2, 2011 at 9:58 AM, Luc Moreau
>>> <L.Moreau@ecs.soton.ac.uk> wrote:
>>>
>>> Hi Satya,
>>>
>>> First, you will note that wasInformedBy is *not* a temporal
>>> relation between process executions.
>>> It is *not* transitive. It requires information to flow
>>> between two PEs.
>>> For wasInformedBy(pe1,pe2),
>>> a minimum constraint is that the end of pe2 does *not*
>>> precede the start of pe1.
>>> The data journalism example had an illustration of such
>>> relation. It has been established to be useful
>>> both theoretically and practically.
>>>
>>> Second, it would be nice for PROV to have a temporal
>>> ordering relation. However, we have to be
>>> careful. The relations
>>> used/generatedBy/derivedFrom/dependedOn/... all have a
>>> notion of causality/influence:
>>> the source of the edge being influenced by the edge destination.
>>>
>>> We know that causal order implies temporal order, but not
>>> the converse. I am therefore reluctant
>>> to introduce a relation that arbitrarily captures temporal
>>> order. What would it give us? After all,
>>> we can associate time with
>>> PEs, and given such time information, we can already decide
>>> if pe1 start precedes pe2 start, or if pe1 end
>>> precedes pe2 start. What would a temporal relation give us
>>> over time?
>>>
>>> The relation wasScheduledAfter attempts to capture some
>>> temporal ordering, with underpinning
>>> causal influence. You are incorrect to state that to assert
>>> wasScheduledAfter you need to know of an agent.
>>> It's exactly the contrary. By asserting wasScheduledAfter,
>>> you also assert the existence of such an
>>> agent, but don't have to specify which it is.
>>>
>>> Final point, your reference [1] had not been agreed, it is
>>> the proposal you made back then.
>>>
>>> So, in conclusion:
>>> 1. I would argue that wasInformedBy is useful, and should be
>>> kept as such, ... and definitely cannot
>>> be subsumed by some temporal ordering.
>>>
>>> 2. Temporal ordering *with* some form of underpinning causal
>>> influence, is also useful. I agree that
>>> wasScheduledAfter is a first attempt. Maybe somebody can
>>> put forward alternative definitions.
>>>
>>> Cheers,
>>> Luc
>>>
>>>
>>> On 02/10/11 02:03, Satya Sahoo wrote:
>>>> Hi Luc,
>>>> I would like to re-raise this issue since the two
>>>> properties defined in PROV-DM, "wasInformedBy" and
>>>> "wasScheduledAfter" do not represent the original property
>>>> for ordering process executions that was agreed to by the
>>>> provenance incubator group and also during the first F2F [1].
>>>>
>>>> I believe there are primarily two dimensions/constraints
>>>> for ordering process executions:
>>>> a) Two PEs are scheduled (by agent/user) to execute in
>>>> particular order at specific time instants, which we can
>>>> represent as *time-based ordering of PEs*. Of course,
>>>> additional information about which agent/user started or
>>>> stopped the PEs can be specified, but the time values
>>>> primarily define the ordering of the PEs.
>>>>
>>>> b) A PE pe1 is designed to initiate/start a second PE pe2
>>>> (due to some condition being satisfied for example a
>>>> specific state was reached or some entity became
>>>> available), which we can represent as a *control-based
>>>> ordering of PEs*. This ordering of processes cannot be
>>>> effectively captured by time-based ordering, since pe1 may
>>>> still be executing while pe2 starts.
>>>>
>>>> Both these cases are captured by the property
>>>> "wasPrecededBy" (the corresponding property in opposite
>>>> direction can be "wasSucceededBy") where the PEs were
>>>> ordered according to their time of start/stop or explicit
>>>> start/stop by another PE.
>>>>
>>>> Some specific comments on the current PROV-DM document
>>>> Section 5.3.6 Ordering of Process Executions
>>>> =====
>>>> 1. An information flow ordering expression is a
>>>> representation that a characterized thing was generated by
>>>> an activity, represented by a process execution expression,
>>>> before it was used by another activity, also represented by
>>>> a process execution expression.
>>>>
>>>> Issue: This is a particular case of "time-based ordering",
>>>> there can be multiple others. For example,
>>>>
>>>> a) We can have the provenance assertions about two PEs Pe1
>>>> and Pe2: Pe1 was stopped at time instant t1 and Pe2 started
>>>> at time instant t2 and t2 > t1. Hence Pe2 wasPrecededBy Pe1
>>>>
>>>> b) Similarly, we have provenance assertions about two PEs
>>>> Pe1, Pe2 and an Entity e1: Pe1 used e1 at time t1 and Pe2
>>>> used e1 at time t2 and t2 > t1, hence (start of) Pe2
>>>> wasPrecededBy (start of) Pe1.
>>>>
>>>> My suggestion is to just create a single generic property for
>>>> ordering of PEs (Khalid had suggested using PEs instead of
>>>> Process) and allow specific provenance application to
>>>> create more specialized PE ordering properties according to
>>>> their requirements.
>>>>
>>>> 2. According to the current definition of
>>>> "wasScheduledAfter" we cannot assert that one PE was
>>>> scheduled after another PE if we don't have information
>>>> about the agent associated with the PEs. Further, the name
>>>> of the property seems to refer to the intended ordering of
>>>> PEs rather than actual execution of PEs - a workflow
>>>> specification may have "scheduled" Pe1 to execute "after"
>>>> Pe2, but during the workflow run, Pe2 may have executed
>>>> before Pe1?
>>>>
>>>> Overall, I am not sure why we need two very special cases
>>>> of PE ordering property instead of using a generic
>>>> "wasPrecededBy" (or "wasSucceededBy") property that can be
>>>> specialized as needed by different provenance applications.
>>>>
>>>> Thanks.
>>>>
>>>> Best,
>>>> Satya
>>>>
>>>> [1]
>>>> http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts#Ordering_of_process_execution
>>>>
>>>> On Fri, Sep 23, 2011 at 8:04 AM, Luc Moreau
>>>> <l.moreau@ecs.soton.ac.uk> wrote:
>>>>
>>>>
>>>> Hi Satya,
>>>>
>>>> Issue has been closed pending review, with the latest
>>>> document version.
>>>> Feel free to reopen if not appropriate.
>>>>
>>>> Luc
>>>>
>>>>
>>>> On 27/07/2011 02:51, Provenance Working Group Issue
>>>> Tracker wrote:
>>>>
>>>> PROV-ISSUE-50 (Ordering of Process): Definition for
>>>> Ordering of Process [Conceptual Model]
>>>>
>>>> http://www.w3.org/2011/prov/track/issues/50
>>>>
>>>> Raised by: Satya Sahoo
>>>> On product: Conceptual Model
>>>>
>>>> I am not sure where we got the currently listed
>>>> definition of "Ordering of Process" - it is neither
>>>> listed in the original provenance concept page [1]
>>>> nor in the consolidated concepts page [2].
>>>>
>>>> I had proposed the following definition:
>>>> "Ordering of processes execution (in provenance)
>>>> needs to be modeled as a property linking process
>>>> entities in specific order along a particular
>>>> dimension (temporal or control flow)"
>>>>
>>>> [1] http://www.w3.org/2011/prov/wiki/ConceptOrderingOfProcesses
>>>> [2] http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts#Ordering_of_process_execution
>>>>
>
> --
> Professor Luc Moreau
> Electronics and Computer Science   tel: +44 23 8059 4487
> University of Southampton          fax: +44 23 8059 2865
> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>
Received on Wednesday, 11 January 2012 21:52:01 UTC
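
The two positions argued in this thread can be sketched in a few lines of Python. This is only an illustrative sketch, not part of PROV-DM or of any PROV library: the class and function names (Generation, Usage, was_informed_by, precedes) are hypothetical, qualifiers are ignored, the year 2011 is assumed for the "Oct 2" example, and each purchase is treated as an instant. It shows the quoted "if and only if" constraint for wasInformedBy, why that relation is not transitive, and the purely time-based check that Luc argues is already available from timestamps.

```python
from dataclasses import dataclass
from datetime import datetime
from itertools import product


@dataclass(frozen=True)
class Generation:
    """Hypothetical record for wasGeneratedBy(entity, pe)."""
    entity: str
    pe: str


@dataclass(frozen=True)
class Usage:
    """Hypothetical record for used(pe, entity)."""
    pe: str
    entity: str


def was_informed_by(generations, usages):
    """Return every pair (pe2, pe1) for which some entity e satisfies
    wasGeneratedBy(e, pe1) and used(pe2, e). Qualifiers are ignored."""
    return {
        (u.pe, g.pe)
        for g, u in product(generations, usages)
        if g.entity == u.entity
    }


def precedes(times, pe1, pe2):
    """Purely temporal check: pe1 ended no later than pe2 started.
    `times` maps a PE name to a (start, end) pair of datetimes."""
    return times[pe1][1] <= times[pe2][0]


# Information flow: pe1 generates e1, pe2 uses e1 and generates e2,
# pe3 uses e2.
generations = [Generation("e1", "pe1"), Generation("e2", "pe2")]
usages = [Usage("pe2", "e1"), Usage("pe3", "e2")]
informed = was_informed_by(generations, usages)

# The accused-person example: no entity flows between the three PEs,
# so only timestamps order them (start == end is an assumption here).
t_car = datetime(2011, 10, 2, 14, 0)
t_flowers = datetime(2011, 10, 2, 16, 0)
t_taxi = datetime(2011, 10, 2, 18, 0)
times = {
    "buy_car": (t_car, t_car),
    "buy_flowers": (t_flowers, t_flowers),
    "take_taxi": (t_taxi, t_taxi),
}

print(sorted(informed))
print(("pe3", "pe1") in informed)  # no shared entity, so not inferred
print(precedes(times, "buy_car", "buy_flowers"))
```

In this sketch wasInformedBy(pe3, pe1) is not derived even though wasInformedBy(pe3, pe2) and wasInformedBy(pe2, pe1) are, which is the non-transitivity Luc points to, while the time-only check orders the three purchases without any entity flowing between them.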