- From: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
- Date: Wed, 30 Nov 2011 08:45:46 +0000
- To: public-prov-wg@w3.org
- Message-ID: <EMEW3|58d2038a8cddd3003b4e3a3c3ecdf6fdnAY8jr08L.Moreau|ecs.soton.ac.uk|4ED5ED3A>
Hi Satya,

The discussion on this thread has not progressed since early October.
The latest WD contains a new relation wasStartedBy between activities,
which is simpler than wasScheduledAfter.

For the second time, I am proposing to formally close this issue.

Best regards,
Luc

On 10/03/2011 08:05 AM, Luc Moreau wrote:
> Hi Satya,
>
> Responses interleaved.
>
> On 03/10/11 01:54, Satya Sahoo wrote:
>> Hi Luc,
>> My comments are inline:
>> >First, you will note that wasInformedBy is *not* a temporal relation between process executions.
>>
>> The PROV-DM currently defines the following constraint for wasInformedBy:
>> Given two process execution expressions denoted by pe1 and pe2, the
>> expression wasInformedBy(pe2,pe1) holds, if and only if there is an
>> entity expression denoted by e and qualifiers q1 and q2, such that
>> wasGeneratedBy(e,pe1,q1) and used(pe2,e,q2) hold.
>>
>> If we consider the two expressions wasGeneratedBy(e, pe1, q1) and
>> used(pe2, e, q2) - these two expressions together enforce that pe2
>> cannot have a start time that is "before" the start time of pe1. This is
>> a temporal relation/ordering between pe1 and pe2. Hence, if both these
>> expressions have to "hold" for wasInformedBy(pe2, pe1) to "hold" I am
>> not sure how it is not a temporal ordering?
>
> I agree that some temporal constraints have to be satisfied for
> wasInformedBy(pe2, pe1), but it's a necessary condition,
> it's not a sufficient condition. Information (represented as entity e
> above) is required to flow between process executions.
>
> Also, it's not a temporal order, but it's a temporal relation! It is
> not transitive!
>
> For these reasons (information flow and non transitivity), I feel that
> wasInformedBy does not fall under
> your temporal ordering classification.
>
>>
>>
>> >Second, it would be nice for PROV to have a temporal ordering relation. However, we have to be
>> >careful. The relations used/generatedBy/derivedFrom/dependedOn/...
>> all have a notion of
>> >causality/influence: the source of the edge being influenced by the edge destination.
>> >We know that causal order implies temporal order, but not the converse. I am therefore reluctant
>> >to introduce a relation that arbitrarily captures temporal order. What would it give us? After all,
>> >we can associate time with PEs, and given such time information, we can already decide if pe1
>> >start precedes pe2 start, or if pe1 end precedes pe2 start. What would a temporal relation give us
>> >over time?
>> There are many non-causal properties that are part of provenance
>> assertions.
>>
>> For example, to reconstruct the history of activities of an accused
>> person X on Oct 2 before X reached the crime scene, the police
>> make the following assertions:
>> 1. X bought a car at 2:00pm US ET - buying the car is PE pe1
>> 2. X bought flowers at 4:00pm US ET - buying flowers is PE pe2
>> 3. X hailed a taxi and travelled to crime scene at 6:00pm US ET -
>> travelling in taxi is PE pe3
>
> This is a nice example where wasScheduledAfter can be used!
>
>>
>> In the above scenario, the police need to have temporal ordering of
>> PEs to establish that person X was in the city on the day of the
>> crime but there is no causal relation between pe1, pe2, and pe3.
>
> There is some underpinning ordering, since there is X at 2pm, X at
> 4pm, and X at 6pm.
> This is exactly the definition of wasScheduledAfter.
>
>>
>> As you stated, temporal ordering may or may not represent a causal
>> relation between PEs and since non-causal ordering of PEs occurs in
>> many provenance applications we need to define a property for
>> temporal ordering of PEs and causality-based temporal ordering is a
>> specialization of that property.
>>
>>
>> >The relation wasScheduledAfter attempts to capture some temporal ordering, with underpinning
>> >causal influence. You are incorrect to state that to assert wasScheduledAfter you need to know
>> >of an agent.
>> >It's exactly the contrary. By asserting wasScheduledAfter, you also assert the
>> >existence of such an agent, but don't have to specify which it is.
>>
>> The PROV-DM currently defines the following constraint
>> for wasScheduledAfter:
>> Given two process execution expressions denoted by pe1 and pe2, the
>> expression wasScheduledAfter(pe2,pe1) holds, if and only if there are
>> two entity expressions denoted by e1 and e2, such that
>> wasControlledBy(pe1,e1,qualifier(role="end")) and
>> wasControlledBy(pe2,e2,qualifier(role="start")) and
>> wasDerivedFrom(e2,e1).
>> and
>> This definition assumes that the activities represented by process
>> execution expressions identified by pe1 and pe2 are controlled by
>> some agents, represented by expressions identified by e1 and e2,
>> where the first agent terminates (control qualifier
>> qualifier(role="end")) the first activity, and the second initiates
>> (control qualifier qualifier(role="start")) the second. The second
>> agent being "derived" from the first enforces temporal ordering. If
>> we don't know which are the Agents associated with pe1 and pe2 then
>> how can we state that they are entities with identifiers e1 and e2?
>>
>> In other words, if there are two PEs (from Taverna workflows) -
>> retrieveGeneSequence and runBLASTService and John (the research
>> robot) ended retrieveGeneSequence and Tom (the research robot -
>> derived from John) started runBLASTService - then we can assert that
>> runBLASTService wasScheduledAfter retrieveGeneSequence.
>>
>> But, if we don't know which Agents are associated with
>> retrieveGeneSequence and runBLASTService PEs then how can we assert
>> wasScheduledAfter property between the two PEs?
>
> You will note that the constraint you copied contains "if and only
> if", so it is defining the expression wasScheduledAfter(pe2,pe1).
> It is therefore fine to assert it.
> The existential quantifier states
> the existence of agents, but when asserting wasScheduledAfter
> you don't need to know their identity. Vice-versa, if you know them
> and all other constraints are satisfied, then you can infer
> a wasScheduledAfter expression.
>
>>
>> There may be a third robot Albert and it is not related to either Tom
>> or John by wasDerivedFrom property. But, a provenance application has
>> to know which of three robots (agents) are associated with the two
>> PEs (and then verify that there is a wasDerivedFrom property linking
>> the two robots).
>>
>> The constraint defined for wasScheduledAfter is a rule and for the
>> rule to "fire" its conditions have to evaluate to "true".
>>
>> Just knowing that there exists some Agent associated
>> with retrieveGeneSequence and runBLASTService PEs will not make the
>> constraint evaluate to "true" - the provenance application has to
>> specify which Agents (John and Tom) were associated with the two PEs.
>>
>> Hence, according to the current PROV-DM text, my understanding is
>> that a provenance application will need to know about the specific
>> agents associated with PEs before they can use the wasScheduledAfter
>> property. This information may or may not be available to a
>> provenance application.
>>
>> Therefore I am raising the need for a generic ordering property for
>> PEs that can be simply asserted by provenance applications. Similar
>> to other provenance assertions the ordering of PEs can be verified
>> later using either timestamps or causal relations constraints.
>
> You have not answered my point. What does this give you that you can't
> infer from time information?
>>
>> >Final point, your reference [1] had not been agreed, it is the proposal you made back then.
>> Hence, I had raised this issue (Issue-50) to discuss the property.
>> To clarify, have there been discussions or agreement on the two
>> properties wasInformedBy and wasScheduledAfter (I may have missed the
>> particular mails in the mailing list)?
>
> To my knowledge, this thread is the only one discussing these issues.
> As Paul indicated a while back, the proposal
> is aligned with the rest of the document.
>
> I would like to see you putting a proper definition of the concept you
> would have in mind. I would argue
> that your original text in [1] is not a definition but a requirement
> to be satisfied. Can you define this notion of temporal order in
> terms of the other "building blocks" of PROV (e.g. process start/end etc.)?
>
> Ultimately, we could introduce Allen's relations
> (http://en.wikipedia.org/wiki/Allen's_Interval_Algebra)
> but I am not sure it would be helpful in this context.
>
> Cheers,
> Luc
>
>>
>> Thanks.
>>
>> Best,
>> Satya
>>
>> On Sun, Oct 2, 2011 at 9:58 AM, Luc Moreau <L.Moreau@ecs.soton.ac.uk
>> <mailto:L.Moreau@ecs.soton.ac.uk>> wrote:
>>
>>     Hi Satya,
>>
>>     First, you will note that wasInformedBy is *not* a temporal
>>     relation between process executions.
>>     It is *not* transitive. It requires information to flow between
>>     two PEs. For wasInformedBy(pe1,pe2),
>>     a minimum constraint is that the end of pe2 does *not* precede
>>     the start of pe1.
>>     The data journalism example had an illustration of such a relation.
>>     It has been established to be useful
>>     both theoretically and practically.
>>
>>     Second, it would be nice for PROV to have a temporal ordering
>>     relation. However, we have to be
>>     careful. The relations
>>     used/generatedBy/derivedFrom/dependedOn/... all have a notion of
>>     causality/influence:
>>     the source of the edge being influenced by the edge destination.
>>
>>     We know that causal order implies temporal order, but not the
>>     converse. I am therefore reluctant
>>     to introduce a relation that arbitrarily captures temporal
>>     order. What would it give us?
>>     After all,
>>     we can associate time with
>>     PEs, and given such time information, we can already decide if
>>     pe1 start precedes pe2 start, or if pe1 end
>>     precedes pe2 start. What would a temporal relation give us over time?
>>
>>     The relation wasScheduledAfter attempts to capture some temporal
>>     ordering, with underpinning
>>     causal influence. You are incorrect to state that to assert
>>     wasScheduledAfter you need to know of an agent.
>>     It's exactly the contrary. By asserting wasScheduledAfter, you
>>     also assert the existence of such an
>>     agent, but don't have to specify which it is.
>>
>>     Final point, your reference [1] had not been agreed, it is the
>>     proposal you made back then.
>>
>>     So, in conclusion:
>>     1. I would argue that wasInformedBy is useful, and should be kept
>>     as such, ... and definitely cannot
>>     be subsumed by some temporal ordering.
>>
>>     2. Temporal ordering *with* some form of underpinning causal
>>     influence, is also useful. I agree that
>>     wasScheduledAfter is a first attempt. Maybe somebody can put
>>     forward alternative definitions.
>>
>>     Cheers,
>>     Luc
>>
>>
>>     On 02/10/11 02:03, Satya Sahoo wrote:
>>> Hi Luc,
>>> I would like to re-raise this issue since the two properties
>>> defined in PROV-DM, "wasInformedBy" and "wasScheduledAfter" do
>>> not represent the original property for ordering process
>>> executions that was agreed to by the provenance incubator group
>>> and also during the first F2F [1].
>>>
>>> I believe there are primarily two dimensions/constraints for
>>> ordering process executions:
>>> a) Two PEs are scheduled (by agent/user) to execute in a
>>> particular order at specific time instants, which we can
>>> represent as *time-based ordering of PEs*. Of course, additional
>>> information about which agent/user started or stopped the PEs
>>> can be specified, but the time values primarily define the
>>> ordering of the PEs.
>>>
>>> b) A PE pe1 is designed to initiate/start a second PE pe2 (due
>>> to some condition being satisfied, for example a specific state
>>> was reached or some entity became available), which we can
>>> represent as a *control-based ordering of PEs*. This ordering of
>>> processes cannot be effectively captured by time-based ordering,
>>> since pe1 may still be executing while pe2 starts.
>>>
>>> Both these cases are captured by the property "wasPrecededBy"
>>> (the corresponding property in opposite direction can be
>>> "wasSucceededBy") where the PEs were ordered according to their
>>> time of start/stop or explicit start/stop by another PE.
>>>
>>> Some specific comments on the current PROV-DM document
>>> Section 5.3.6 Ordering of Process Executions
>>> =====
>>> 1. An information flow ordering expression is a representation
>>> that a characterized thing was generated by an activity,
>>> represented by a process execution expression, before it was used
>>> by another activity, also represented by a process execution
>>> expression.
>>>
>>> Issue: This is a particular case of "time-based ordering", there
>>> can be multiple others. For example,
>>>
>>> a) We can have the provenance assertions about two PEs Pe1 and
>>> Pe2: Pe1 was stopped at time instant t1 and Pe2 started at time
>>> instant t2 and t2 > t1. Hence Pe2 wasPrecededBy Pe1
>>>
>>> b) Similarly, we have provenance assertions about two PEs Pe1,
>>> Pe2 and an Entity e1: Pe1 used e1 at time t1 and Pe2 used e1 at
>>> time t2 and t2 > t1, hence (start of) Pe2 wasPrecededBy (start
>>> of) Pe1.
>>>
>>> My suggestion is to just create a single generic property for
>>> ordering of PEs (Khalid had suggested using PEs instead of
>>> Process) and allow specific provenance applications to create
>>> more specialized PE ordering properties according to their
>>> requirements.
>>>
>>> 2.
>>> According to the current definition of "wasScheduledAfter" we
>>> cannot assert that one PE was scheduled after another PE if we
>>> don't have information about the agent associated with the PEs.
>>> Further, the name of the property seems to refer to the intended
>>> ordering of PEs rather than actual execution of PEs - a workflow
>>> specification may have "scheduled" Pe1 to execute "after" Pe2,
>>> but during the workflow run, Pe1 may have executed before Pe2?
>>>
>>> Overall, I am not sure why we need two very special cases of PE
>>> ordering property instead of using a generic "wasPrecededBy" (or
>>> "wasSucceededBy") property that can be specialized as needed by
>>> different provenance applications.
>>>
>>> Thanks.
>>>
>>> Best,
>>> Satya
>>>
>>> [1]
>>> http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts#Ordering_of_process_execution
>>>
>>> On Fri, Sep 23, 2011 at 8:04 AM, Luc Moreau
>>> <l.moreau@ecs.soton.ac.uk <mailto:l.moreau@ecs.soton.ac.uk>> wrote:
>>>
>>>
>>>     Hi Satya,
>>>
>>>     Issue has been closed pending review, with the latest
>>>     document version.
>>>     Feel free to reopen if not appropriate.
>>>
>>>     Luc
>>>
>>>
>>>     On 27/07/2011 02:51, Provenance Working Group Issue Tracker
>>>     wrote:
>>>
>>>         PROV-ISSUE-50 (Ordering of Process): Defintion for
>>>         Ordering of Process [Conceptual Model]
>>>
>>>         http://www.w3.org/2011/prov/track/issues/50
>>>
>>>         Raised by: Satya Sahoo
>>>         On product: Conceptual Model
>>>
>>>         I am not sure where we got the currently listed
>>>         definition of "Ordering of Process" - it is neither
>>>         listed in the original provenance concept page [1] nor
>>>         in the consolidated concepts page [2].
>>>
>>>         I had proposed the following definition:
>>>         "Ordering of processes execution (in provenance) needs
>>>         to be modeled as a property linking process entities in
>>>         specific order along a particular dimension (temporal or
>>>         control flow)"
>>>
>>>         [1] http://www.w3.org/2011/prov/wiki/ConceptOrderingOfProcesses
>>>         [2]
>>>         http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts#Ordering_of_process_execution
>>>
>>

--
Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
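
As an illustration of two points made in this thread, here is a minimal Python sketch. It assumes a toy encoding that is not part of PROV-DM: assertions are plain tuples, qualifiers are dropped, and the names was_informed_by, started_before, generated, used and starts are hypothetical. The first part derives wasInformedBy(pe2,pe1) from a shared entity, following the "if and only if" constraint quoted above, and shows a case where the derived relation is not transitive. The second part shows that a start-time comparison alone already orders the three PEs of the police example, with the 2011 date assumed for the sake of the example.

```python
from datetime import datetime
from itertools import product

# Hypothetical toy encoding (qualifiers omitted):
#   generated holds (entity, pe) pairs, read as wasGeneratedBy(entity, pe)
#   used holds (pe, entity) pairs, read as used(pe, entity)


def was_informed_by(used, generated):
    """Return every (pe2, pe1) such that some entity e satisfies
    wasGeneratedBy(e, pe1) and used(pe2, e)."""
    return {
        (pe2, pe1)
        for (e_gen, pe1), (pe2, e_used) in product(generated, used)
        if e_gen == e_used
    }


def started_before(starts, a, b):
    """Purely temporal check: did PE a start strictly before PE b?"""
    return starts[a] < starts[b]


# Information flow: pe1 generates e1, pe2 uses e1 and generates e2, pe3 uses e2.
generated = {("e1", "pe1"), ("e2", "pe2")}
used = {("pe2", "e1"), ("pe3", "e2")}
informed = was_informed_by(used, generated)

# The police example: ordering recovered from time information only.
starts = {
    "pe1": datetime(2011, 10, 2, 14, 0),  # bought a car
    "pe2": datetime(2011, 10, 2, 16, 0),  # bought flowers
    "pe3": datetime(2011, 10, 2, 18, 0),  # travelled by taxi
}

if __name__ == "__main__":
    print(sorted(informed))
    print(("pe3", "pe1") in informed)
    print(started_before(starts, "pe1", "pe3"))
```

In this sketch pe3 is informed by pe2 and pe2 by pe1, yet no entity generated by pe1 is used by pe3, so the pair (pe3, pe1) is not derived. The timestamp check, by contrast, is transitive by construction, which is the distinction between information flow and temporal ordering argued over above.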
Received on Wednesday, 30 November 2011 08:46:42 UTC