- From: Satya Sahoo <satya.sahoo@case.edu>
- Date: Wed, 11 Jan 2012 13:03:12 -0500
- To: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
- Cc: public-prov-wg@w3.org
- Message-ID: <CAOMwk6w13ShP++69ks48KyFDzU6CXzo7xv15t-0ZQc=ZG1UbNQ@mail.gmail.com>
Hi Luc,

Since the points raised in this issue have been superseded by updates to DM, I am comfortable closing this issue.

Thanks.

Best,
Satya

On Wed, Nov 30, 2011 at 3:45 AM, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:

> Hi Satya,
>
> The discussion on this thread has not progressed since early October.
>
> The latest WD contains a new relation wasStartedBy between activities, which is simpler than wasScheduledAfter.
>
> For the second time, I am proposing to formally close this issue.
>
> Best regards,
> Luc
>
> On 10/03/2011 08:05 AM, Luc Moreau wrote:
>
> Hi Satya,
>
> Responses interleaved.
>
> On 03/10/11 01:54, Satya Sahoo wrote:
>
> Hi Luc,
> My comments are inline:
> >First, you will note that wasInformedBy is *not* a temporal relation between process executions.
>
> The PROV-DM currently defines the following constraint for wasInformedBy:
> Given two process execution expressions denoted by pe1 and pe2, the expression wasInformedBy(pe2,pe1) holds, if and only if there is an entity expression denoted by e and qualifiers q1 and q2, such that wasGeneratedBy(e,pe1,q1) and used(pe2,e,q2) hold.
>
> If we consider the two expressions wasGeneratedBy(e, pe1, q1) and used(pe2, e, q2) - these two expressions together enforce that pe2 cannot have a start time that is "before" the start time of pe1. This is a temporal relation/ordering between pe1 and pe2. Hence, if both these expressions have to "hold" for wasInformedBy(pe2, pe1) to "hold", I am not sure how it is not a temporal ordering?
>
> I agree that some temporal constraints have to be satisfied for wasInformedBy(pe2, pe1), but it's a necessary condition, it's not a sufficient condition. Information (represented as entity e above) is required to flow between process executions.
>
> Also, it's not a temporal order, but it's a temporal relation! It is not transitive!
>
> For these reasons (information flow and non-transitivity), I feel that wasInformedBy does not fall under your temporal ordering classification.
>
> >Second, it would be nice for PROV to have a temporal ordering relation. However, we have to be careful. The relations used/generatedBy/derivedFrom/dependedOn/... all have a notion of causality/influence: the source of the edge being influenced by the edge destination.
> >We know that causal order implies temporal order, but not the converse. I am therefore reluctant to introduce a relation that arbitrarily captures temporal order. What would it give us? After all, we can associate time with PEs, and given such time information, we can already decide if pe1 start precedes pe2 start, or if pe1 end precedes pe2 start. What would a temporal relation give us over time?
>
> There are many non-causal properties that are part of provenance assertions.
>
> For example, to reconstruct the history of activities of an accused person X on Oct 2 before X reached the crime scene, the police make the following assertions:
> 1. X bought a car at 2:00pm US ET - buying the car is PE pe1
> 2. X bought flowers at 4:00pm US ET - buying flowers is PE pe2
> 3. X hailed a taxi and travelled to the crime scene at 6:00pm US ET - travelling in the taxi is PE pe3
>
> This is a nice example where wasScheduledAfter can be used!
>
> In the above scenario, the police need to have temporal ordering of PEs to establish that person X was in the city on the day of the crime but there is no causal relation between pe1, pe2, and pe3.
>
> There is some underpinning ordering, since there is X at 2pm, X at 4pm, and X at 6pm.
> This is exactly the definition of wasScheduledAfter.
>
> As you stated, temporal ordering may or may not represent a causal relation between PEs, and since non-causal ordering of PEs occurs in many provenance applications we need to define a property for temporal ordering of PEs; causality-based temporal ordering is a specialization of that property.
>
> >The relation wasScheduledAfter attempts to capture some temporal ordering, with underpinning causal influence. You are incorrect to state that to assert wasScheduledAfter you need to know of an agent. It's exactly the contrary. By asserting wasScheduledAfter, you also assert the existence of such an agent, but don't have to specify which it is.
>
> The PROV-DM currently defines the following constraint for wasScheduledAfter:
> Given two process execution expressions denoted by pe1 and pe2, the expression wasScheduledAfter(pe2,pe1) holds, if and only if there are two entity expressions denoted by e1 and e2, such that wasControlledBy(pe1,e1,qualifier(role="end")) and wasControlledBy(pe2,e2,qualifier(role="start")) and wasDerivedFrom(e2,e1).
> and
> This definition assumes that the activities represented by process execution expressions identified by pe1 and pe2 are controlled by some agents, represented by expressions identified by e1 and e2, where the first agent terminates (control qualifier qualifier(role="end")) the first activity, and the second initiates (control qualifier qualifier(role="start")) the second. The second agent being "derived" from the first enforces temporal ordering. If we don't know which are the Agents associated with pe1 and pe2 then how can we state that they are entities with identifiers e1 and e2?
>
> In other words, if there are two PEs (from Taverna workflows) - retrieveGeneSequence and runBLASTService - and John (the research robot) ended retrieveGeneSequence and Tom (the research robot - derived from John) started runBLASTService, then we can assert that runBLASTService wasScheduledAfter retrieveGeneSequence.
>
> But, if we don't know which Agents are associated with the retrieveGeneSequence and runBLASTService PEs then how can we assert the wasScheduledAfter property between the two PEs?
>
> You will note that the constraint you copied contains "if and only if", so it is defining the expression wasScheduledAfter(pe2,pe1). It is therefore fine to assert it. The existential quantifier states the existence of agents, but when asserting wasScheduledAfter you don't need to know their identity. Vice versa, if you know them and all other constraints are satisfied, then you can infer a wasScheduledAfter expression.
>
> There may be a third robot Albert and it is not related to either Tom or John by the wasDerivedFrom property. But, a provenance application has to know which of the three robots (agents) are associated with the two PEs (and then verify that there is a wasDerivedFrom property linking the two robots).
>
> The constraint defined for wasScheduledAfter is a rule and for the rule to "fire" its conditions have to evaluate to "true".
>
> Just knowing that there exists some Agent associated with the retrieveGeneSequence and runBLASTService PEs will not make the constraint evaluate to "true" - the provenance application has to specify which Agents (John and Tom) were associated with the two PEs.
>
> Hence, according to the current PROV-DM text, my understanding is that a provenance application will need to know about the specific agents associated with PEs before it can use the wasScheduledAfter property. This information may or may not be available to a provenance application.
>
> Therefore I am raising the need for a generic ordering property for PEs that can be simply asserted by provenance applications. Similar to other provenance assertions, the ordering of PEs can be verified later using either timestamps or causal relation constraints.
>
> You have not answered my point. What does this give you that you can't infer from time information?
>
> >Final point, your reference [1] had not been agreed, it is the proposal you made back then.
> Hence, I had raised this issue (Issue-50) to discuss the property. To clarify, have there been discussions or agreement on the two properties isInformedBy and wasScheduledAfter (I may have missed the particular mails in the mailing list)?
>
> To my knowledge, this thread is the only one discussing these issues. As Paul indicated a while back, the proposal is aligned with the rest of the document.
>
> I would like to see you putting a proper definition of the concept you would have in mind. I would argue that your original text in [1] is not a definition but a requirement to be satisfied. Can you define this notion of temporal order in terms of the other "building block" of PROV (e.g. process start/end etc).
>
> Ultimately, we could introduce Allen's relations (http://en.wikipedia.org/wiki/Allen's_Interval_Algebra) but I am not sure it would be helpful in this context.
>
> Cheers,
> Luc
>
> Thanks.
>
> Best,
> Satya
>
> On Sun, Oct 2, 2011 at 9:58 AM, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
>
>> Hi Satya,
>>
>> First, you will note that wasInformedBy is *not* a temporal relation between process executions. It is *not* transitive. It requires information to flow between two PEs. For wasInformedBy(pe1,pe2), a minimum constraint is that the end of pe2 does *not* precede the start of pe1. The data journalism example had an illustration of such a relation. It has been established to be useful both theoretically and practically.
>>
>> Second, it would be nice for PROV to have a temporal ordering relation. However, we have to be careful. The relations used/generatedBy/derivedFrom/dependedOn/... all have a notion of causality/influence: the source of the edge being influenced by the edge destination.
>>
>> We know that causal order implies temporal order, but not the converse. I am therefore reluctant to introduce a relation that arbitrarily captures temporal order. What would it give us? After all, we can associate time with PEs, and given such time information, we can already decide if pe1 start precedes pe2 start, or if pe1 end precedes pe2 start. What would a temporal relation give us over time?
>>
>> The relation wasScheduledAfter attempts to capture some temporal ordering, with underpinning causal influence. You are incorrect to state that to assert wasScheduledAfter you need to know of an agent. It's exactly the contrary. By asserting wasScheduledAfter, you also assert the existence of such an agent, but don't have to specify which it is.
>>
>> Final point, your reference [1] had not been agreed, it is the proposal you made back then.
>>
>> So, in conclusion:
>> 1. I would argue that wasInformedBy is useful, and should be kept as such, ... and definitely cannot be subsumed by some temporal ordering.
>>
>> 2. Temporal ordering *with* some form of underpinning causal influence is also useful. I agree that wasScheduledAfter is a first attempt. Maybe somebody can put forward alternative definitions.
>>
>> Cheers,
>> Luc
>>
>> On 02/10/11 02:03, Satya Sahoo wrote:
>>
>> Hi Luc,
>> I would like to re-raise this issue since the two properties defined in PROV-DM, "wasInformedBy" and "wasScheduledAfter", do not represent the original property for ordering process executions that was agreed to by the provenance incubator group and also during the first F2F [1].
>>
>> I believe there are primarily two dimensions/constraints for ordering process executions:
>> a) Two PEs are scheduled (by agent/user) to execute in a particular order at specific time instants, which we can represent as *time-based ordering of PEs*. Of course, additional information about which agent/user started or stopped the PEs can be specified, but the time values primarily define the ordering of the PEs.
>>
>> b) A PE pe1 is designed to initiate/start a second PE pe2 (due to some condition being satisfied, for example a specific state was reached or some entity became available), which we can represent as a *control-based ordering of PEs*. This ordering of processes cannot be effectively captured by time-based ordering, since pe1 may still be executing while pe2 starts.
>>
>> Both these cases are captured by the property "wasPrecededBy" (the corresponding property in the opposite direction can be "wasSucceededBy") where the PEs were ordered according to their time of start/stop or explicit start/stop by another PE.
>>
>> Some specific comments on the current PROV-DM document Section 5.3.6 Ordering of Process Executions
>> =====
>> 1. An information flow ordering expression is a representation that a characterized thing was generated by an activity, represented by a process execution expression, before it was used by another activity, also represented by a process execution expression.
>>
>> Issue: This is a particular case of "time-based ordering"; there can be multiple others. For example,
>>
>> a) We can have the provenance assertions about two PEs Pe1 and Pe2: Pe1 was stopped at time instant t1 and Pe2 started at time instant t2 and t2 > t1. Hence Pe2 wasPrecededBy Pe1
>>
>> b) Similarly, we have provenance assertions about two PEs Pe1, Pe2 and an Entity e1: Pe1 used e1 at time t1 and Pe2 used e1 at time t2 and t2 > t1, hence (start of) Pe2 wasPrecededBy (start of) Pe1.
>>
>> My suggestion is to just create a single generic property for ordering of PEs (Khalid had suggested using PEs instead of Process) and allow specific provenance applications to create more specialized PE ordering properties according to their requirements.
>>
>> 2. According to the current definition of "wasScheduledAfter" we cannot assert that one PE was scheduled after another PE if we don't have information about the agent associated with the PEs. Further, the name of the property seems to refer to the intended ordering of PEs rather than the actual execution of PEs - a workflow specification may have "scheduled" Pe1 to execute "after" Pe2, but during the workflow run, Pe2 may have executed before Pe1?
>>
>> Overall, I am not sure why we need two very special cases of PE ordering property instead of using a generic "wasPrecededBy" (or "wasSucceededBy") property that can be specialized as needed by different provenance applications.
>>
>> Thanks.
>>
>> Best,
>> Satya
>>
>> [1] http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts#Ordering_of_process_execution
>>
>> On Fri, Sep 23, 2011 at 8:04 AM, Luc Moreau <l.moreau@ecs.soton.ac.uk> wrote:
>>
>>> Hi Satya,
>>>
>>> Issue has been closed pending review, with the latest document version.
>>> Feel free to reopen if not appropriate.
>>>
>>> Luc
>>>
>>> On 27/07/2011 02:51, Provenance Working Group Issue Tracker wrote:
>>>
>>>> PROV-ISSUE-50 (Ordering of Process): Defintion for Ordering of Process [Conceptual Model]
>>>>
>>>> http://www.w3.org/2011/prov/track/issues/50
>>>>
>>>> Raised by: Satya Sahoo
>>>> On product: Conceptual Model
>>>>
>>>> I am not sure where we got the currently listed definition of "Ordering of Process" - it is listed neither in the original provenance concept page [1] nor in the consolidated concepts page [2].
>>>>
>>>> I had proposed the following definition:
>>>> "Ordering of processes execution (in provenance) needs to be modeled as a property linking process entities in specific order along a particular dimension (temporal or control flow)"
>>>>
>>>> [1] http://www.w3.org/2011/prov/wiki/ConceptOrderingOfProcesses
>>>> [2] http://www.w3.org/2011/prov/wiki/ConsolidatedConcepts#Ordering_of_process_execution
>
> --
> Professor Luc Moreau
> Electronics and Computer Science   tel:   +44 23 8059 4487
> University of Southampton          fax:   +44 23 8059 2865
> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
Received on Wednesday, 11 January 2012 18:08:49 UTC
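
The wasInformedBy constraint quoted in this thread (wasInformedBy(pe2,pe1) holds if and only if some entity e satisfies wasGeneratedBy(e,pe1,q1) and used(pe2,e,q2)) can be illustrated with a minimal Python sketch. It assumes a toy in-memory encoding that is not part of PROV-DM: assertions are plain tuples, the qualifiers q1 and q2 are dropped, and the identifiers e1, e2, pe1, pe2, pe3 and the start hours are hypothetical. Under those assumptions it shows the two points argued above: wasInformedBy is not transitive, while a comparison of start times alone (as in the 2pm/4pm/6pm example) orders every pair of process executions.

```python
# Hypothetical toy assertions; not taken from any real PROV-DM document.
generated = {("e1", "pe1"), ("e2", "pe2")}   # wasGeneratedBy(entity, pe)
used = {("pe2", "e1"), ("pe3", "e2")}        # used(pe, entity)


def was_informed_by(generated_facts, used_facts):
    """Derive (consumer, producer) pairs: some entity generated by
    the producer was used by the consumer (qualifiers omitted)."""
    return {
        (consumer, producer)
        for (entity, producer) in generated_facts
        for (consumer, used_entity) in used_facts
        if entity == used_entity
    }


informed = was_informed_by(generated, used)

# pe2 was informed by pe1 and pe3 by pe2, yet no entity flows from
# pe1 to pe3, so (pe3, pe1) is absent: the relation is not transitive.
not_transitive = ("pe3", "pe1") not in informed

# Assumed start hours (24h clock) mirroring the 2pm/4pm/6pm example.
start_hour = {"pe1": 14, "pe2": 16, "pe3": 18}


def start_precedes(a, b, starts):
    """Purely time-based ordering: a started strictly before b."""
    return starts[a] < starts[b]
```

Here start_precedes("pe1", "pe3", start_hour) holds even though pe3 was not informed by pe1, which is the gap between temporal ordering and information flow that the thread debates.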