Re: PROV-ISSUE-50 (Ordering of Process): Defintion for Ordering of Process [Conceptual Model]

Hi Satya,

First, you will note that wasInformedBy is *not* a temporal relation between
process executions, and it is *not* transitive. It requires information to
flow between two PEs. For wasInformedBy(pe1,pe2), a minimum constraint is
that the end of pe2 does *not* precede the start of pe1. The data journalism
example had an illustration of such a relation. It has been established to be
useful both theoretically and practically.
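
To illustrate why information flow is not transitive, here is a minimal
Python sketch. It assumes, following the Section 5.3.6 definition quoted
further down, that one PE is informed by another when it used an entity the
other generated. The names generated_by, used and informed_by, and the
(informed, informant) pair order, are illustrative assumptions, not PROV-DM
syntax.

```python
# Hypothetical toy records; names are illustrative, not PROV-DM syntax.
generated_by = {"e1": "pe3", "e2": "pe2"}  # entity -> PE that generated it
used = {("pe2", "e1"), ("pe1", "e2")}      # (PE, entity that PE used)


def informed_by(used, generated_by):
    """Return (informed, informant) pairs that share an entity."""
    return {(pe, generated_by[e]) for pe, e in used if e in generated_by}


links = informed_by(used, generated_by)

# pe2 was informed by pe3 and pe1 was informed by pe2, yet no entity
# generated by pe3 was used by pe1, so there is no link from pe1 to pe3.
print(("pe1", "pe3") in links)
```

The chain pe3 -> pe2 -> pe1 exists, but nothing flows directly from pe3 to
pe1, which is why the relation cannot be treated as transitive.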

Second, it would be nice for PROV to have a temporal ordering relation.
However, we have to be careful. The relations
used/generatedBy/derivedFrom/dependedOn/... all have a notion of
causality/influence: the source of the edge being influenced by the edge
destination.

We know that causal order implies temporal order, but not the converse. I am
therefore reluctant to introduce a relation that arbitrarily captures
temporal order. What would it give us? After all, we can associate time with
PEs, and given such time information, we can already decide whether the start
of pe1 precedes the start of pe2, or whether the end of pe1 precedes the
start of pe2. What would a temporal relation give us over and above the time
information?
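
As a rough sketch of that point, assuming each PE carries a start and an end
time, both comparisons reduce to plain timestamp checks. The PE class, the
function names and the timestamps below are hypothetical and only serve the
illustration.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class PE:
    """Hypothetical process execution with start and end times."""
    name: str
    start: datetime
    end: datetime


def start_precedes_start(a: PE, b: PE) -> bool:
    """True if a starts strictly before b starts."""
    return a.start < b.start


def end_precedes_start(a: PE, b: PE) -> bool:
    """True if a ends strictly before b starts."""
    return a.end < b.start


# Illustrative timestamps: pe1 is still running when pe2 starts.
pe1 = PE("pe1", datetime(2011, 10, 2, 9, 0), datetime(2011, 10, 2, 10, 0))
pe2 = PE("pe2", datetime(2011, 10, 2, 9, 30), datetime(2011, 10, 2, 11, 0))

print(start_precedes_start(pe1, pe2), end_precedes_start(pe1, pe2))
```

Neither check says anything about influence between the two PEs; it only
compares the recorded times.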

The relation wasScheduledAfter attempts to capture some temporal ordering,
with underpinning causal influence. You are incorrect to state that to assert
wasScheduledAfter you need to know of an agent. It's exactly the contrary. By
asserting wasScheduledAfter, you also assert the existence of such an agent,
but don't have to specify which it is.

Final point: your reference [1] was not agreed upon; it is the proposal you
made back then.

So, in conclusion:
1. I would argue that wasInformedBy is useful and should be kept as such,
    and definitely cannot be subsumed by some temporal ordering.

2. Temporal ordering *with* some form of underpinning causal influence is
    also useful. I agree that wasScheduledAfter is a first attempt. Maybe
    somebody can put forward alternative definitions.


On 02/10/11 02:03, Satya Sahoo wrote:
> Hi Luc,
> I would like to re-raise this issue since the two properties defined 
> in PROV-DM, "wasInformedBy" and "wasScheduledAfter" do not represent 
> the original property for ordering process executions that was agreed 
> to by the provenance incubator group and also during the first F2F [1].
> I believe there are primarily two dimensions/constraints for ordering 
> process executions:
> a) Two PEs are scheduled (by agent/user) to execute in particular 
> order at specific time instants, which we can represent as *time-based 
> ordering of PEs*. Of course, additional information about which 
> agent/user started or stopped the PEs can be specified, but the time 
> values primarily define the ordering of the PEs.
> b) A PE pe1 is designed to initiate/start a second PE pe2 (due to some 
> condition being satisfied for example a specific state was reached or 
> some entity became available), which we can represent as a 
> *control-based ordering of PEs*. This ordering of process cannot be 
> effectively captured by time-based ordering, since pe1 may still be 
> executing while pe2 starts.
> Both these cases are captured by the property "wasPrecededBy" (the 
> corresponding property in opposite direction can be "wasSucceededBy") 
> where the PEs were ordered according to their time of start/stop or 
> explicit start/stop by another PE.
> Some specific comments on the current PROV-DM document Section 5.3.6 
> Ordering of Process Executions
> =====
> 1. An information flow ordering expression is a representation that a 
> characterized thing was generated by an activity, represented by a 
> process execution expression, before it was used by another activity, 
> also represented by a process execution expression.
> Issue: This is a particular case of "time-based ordering"; there can 
> be multiple others. For example,
> a) We can have the provenance assertions about two PEs Pe1 and Pe2: 
> Pe1 was stopped at time instant t1 and Pe2 started at time instant t2 
> and t2 > t1. Hence Pe2 wasPrecededBy Pe1
> b) Similarly, we have provenance assertions about two PEs Pe1, Pe2 and 
> an Entity e1: Pe1 used e1 at time t1 and PE2 used e1 at time t2 and t2 
> > t1, hence (start of) Pe2 wasPrecededBy (start of) Pe1.
> My suggestion is to just create a single generic property for ordering of 
> PEs (Khalid had suggested using PEs instead of Process) and allow 
> specific provenance application to create more specialized PE ordering 
> properties according to their requirements.
> 2. According to the current definition of "wasScheduledAfter" we 
> cannot assert that one PE was scheduled after another PE if we don't 
> have information about the agent associated with the PEs. Further, the 
> name of the property seems to refer to the intended ordering of PEs 
> rather than actual execution of PEs - a workflow specification may 
> have "scheduled" Pe1 to execute "after" Pe2, but during the workflow 
> run, Pe2 may have executed before Pe1?
> Overall, I am not sure why we need two very special cases of PE 
> ordering property instead of using a generic "wasPrecededBy" (or 
> "wasSucceededBy") property that can be specialized as needed by 
> different provenance applications.
> Thanks.
> Best,
> Satya
> [1] 
> On Fri, Sep 23, 2011 at 8:04 AM, Luc Moreau wrote:
>     Hi Satya,
>     Issue has been closed pending review, with the latest document
>     version.
>     Feel free to reopen if not appropriate.
>     Luc
>     On 27/07/2011 02:51, Provenance Working Group Issue Tracker wrote:
>         PROV-ISSUE-50 (Ordering of Process): Defintion for Ordering of
>         Process [Conceptual Model]
>         Raised by: Satya Sahoo
>         On product: Conceptual Model
>         I am not sure where we got the currently listed definition
>         of "Ordering of Process" - it is neither listed in the
>         original provenance concept page [1] nor in the consolidated
>         concepts page [2].
>         I had proposed the following definition:
>         "Ordering of processes execution (in provenance) needs to be
>         modeled as a property linking process entities in specific
>         order along a particular dimension (temporal or control flow)"
>         [1]
>         [2]

Received on Sunday, 2 October 2011 13:58:41 UTC