Re: How to associate a Plan with a Bundle

Hi Jacco,
in addition to what has already been suggested, you can use the
prov:wasInfluencedBy relationship to state that the Bundle was
influenced by the Plan (or you can specialize this relation with a
subproperty such as :influencedByPlan).

I suggest this because sometimes you may not want to record the
activity that generated the bundle (for example, if it occurs
independently of the workflow execution, as Stian explained). Also, a
binary relationship is easier to query than the indirection through
the activity.
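
As a rough sketch of what I mean (the ex: namespace and the property
name ex:influencedByPlan are made up for illustration, and the bundle
and plan names just mirror Tim's example below):

    @prefix prov: <http://www.w3.org/ns/prov#> .
    @prefix rdfs: <http://www.w3.org/2000/01/rdf-schema#> .
    @prefix ex:   <http://example.org/> .

    # Hypothetical extension property; the name is only a suggestion.
    ex:influencedByPlan rdfs:subPropertyOf prov:wasInfluencedBy ;
        rdfs:range prov:Plan .

    # Direct, binary link from the bundle to the plan. No generating
    # activity has to be recorded for this statement to make sense.
    ex:my_bundle a prov:Bundle ;
        ex:influencedByPlan ex:workflow_plan_1 .

    ex:workflow_plan_1 a prov:Plan .

A query for the plan of a bundle is then a single triple pattern,
instead of a path through prov:wasGeneratedBy,
prov:qualifiedAssociation and prov:hadPlan.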

Best,
Daniel

2013/1/30 Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>

> On Tue, Jan 29, 2013 at 4:37 PM, Timothy Lebo <lebot@rpi.edu> wrote:
> > The key is that the Bundle was generated by the Activity of executing
> > the workflow, and the Activity's Association had a Plan that was
> > followed to perform the Activity.
>
> This kind of "upper level activity" is very similar to how we include
> this kind of 'meta provenance' in the workflow engine Taverna's
> provenance traces.
>
> Just a slight change: we make a distinction between the activities of
> exporting provenance and executing the workflow, as export can be done
> at an arbitrary point in the future, while the provenance data is kept
> in an internal database. The reason is also that in Taverna, the
> workflow definition does not say anything about making provenance
> traces or where outputs should be saved.
>
>
> The following is not exactly what comes out of the workflow
> engine, as I still have to massage Sesame's AliBaba into being a bit
> more verbose about class membership and superproperties. (It will not
> bother declaring class membership when it's given by the domain/range
> of a property.)
>
>
> Note that we use two extensions of PROV:
>
> * <http://purl.org/wf4ever/wfprov#> (An attempt to form a general
> dataflow provenance model, see <http://purl.org/wf4ever/model> for
> pretty picture)
> * <http://ns.taverna.org.uk/2012/tavernaprov/> (which is Taverna
> specialized)
>
> The first might be relevant for you if your workflow system is
> dataflow oriented, but it would need further work if you need to cover
> BPEL-like or Kepler-like workflows.
>
>
> Example, based on
>
> https://github.com/wf4ever/provenance-corpus/blob/master/Taverna_repository/workflow_3152_version_1/run_1/workflowrun.prov.ttl
>
> ### Self-documenting metadata - we don't bother with named graphs for
> ### this purpose
>
> <> a prov:Bundle ;
>     prov:wasGeneratedBy :taverna-prov-export .
>
> # This is the activity that has generated all the files in
> # https://github.com/wf4ever/provenance-corpus/tree/master/Taverna_repository/workflow_3152_version_1/run_1
>
> :taverna-prov-export a prov:Activity ;
>         rdfs:label "taverna-prov export of workflow run provenance" ;
>         prov:wasAssociatedWith :taverna-engine ;
>         prov:qualifiedAssociation [
>             a prov:Association ;
>             prov:agent :taverna-engine ;
>             # For the export, the plan is the workflow engine software
>             prov:hadPlan <http://ns.taverna.org.uk/2011/software/taverna-2.4.0>
>         ] ;
>         prov:startedAtTime "2012-10-05T14:14:08.171+02:00"^^xsd:dateTime ;
>         prov:endedAtTime "2012-10-05T14:14:37.750+02:00"^^xsd:dateTime ;
>         # The link to the workflow run activity which provided the
>         # provenance data
>         prov:wasInformedBy
>             <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/> .
>
> #### End of metadata
>
> :taverna-engine a prov:SoftwareAgent, wfprov:WorkflowEngine, foaf:Agent .
>
> # We don't say much about the engine instance at all as we found it
> # difficult to identify (our engine runs on desktop computers, servers
> # and cloud instances and hence doesn't easily have a resolvable URI),
> # and also difficult to scope (is it one execution of a particular
> # workflow (which technically is how the "engine" is instantiated in
> # Taverna), an execution of the software code (an operating system
> # process), one installation on a particular machine, that particular
> # software/plugin combination on any machine, or any version of the
> # Taverna software?)
>
>
> # The upper workflow run, corresponding to the master workflow.
> # We have discussed making an even higher-level activity, which would
> # cover "Starting the workflow run", would not have a plan, would be
> # associated with the user who clicked the Run button, and would cover
> # further actions beyond the workflow definition, such as saving of
> # outputs.
>
> <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/>
>         a wfprov:WorkflowRun, prov:Activity ;
>         rdfs:label "Workflow run of Workflow21" ;
>         prov:startedAtTime "2012-10-05T14:13:14.921+02:00"^^xsd:dateTime ;
>         prov:endedAtTime "2012-10-05T14:13:17.281+02:00"^^xsd:dateTime ;
>
>         # The wfprov shortcut for showing the plan in the below association
>         wfprov:describedByWorkflow
>             <http://ns.taverna.org.uk/2010/workflowBundle/9f925cdf-98b9-4034-8d25-6b44b15a4635/workflow/Workflow21/> ;
>
>         wfprov:wasEnactedBy :taverna-engine ;
>         prov:wasAssociatedWith :taverna-engine ;
>         prov:qualifiedAssociation [
>             a prov:Association ;
>             prov:agent :taverna-engine ;
>             # This is the identifier for the workflow definition.
>             # Note: A workflowBundle is not a prov:Bundle, btw, it's
>             # just a zip of RDFs
>             prov:hadPlan
>                 <http://ns.taverna.org.uk/2010/workflowBundle/9f925cdf-98b9-4034-8d25-6b44b15a4635/workflow/Workflow21/>
>         ] ;
>
>         # The PROV WG recommendation for associating this activity
>         # with 'sub activities'
>         dcterms:hasPart
>             <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/process/1c4240de-4217-4fb7-b2f5-11626f584071/> ,
>             <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/process/d0a6455a-7c47-42f8-9bc7-8515bcd445aa/> .
>
>
> # An execution of a particular step/process in the workflow
> <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/process/d0a6455a-7c47-42f8-9bc7-8515bcd445aa/>
>         a wfprov:ProcessRun, prov:Activity ;
>         rdfs:label "Processor execution Beanshell (facade0:Workflow21:Beanshell)" ;
>         prov:startedAtTime "2012-10-05T14:13:17.171+02:00"^^xsd:dateTime ;
>         prov:endedAtTime "2012-10-05T14:13:17.250+02:00"^^xsd:dateTime ;
>
>         # A stronger link to the master run than dcterms:hasPart above
>         wfprov:wasPartOfWorkflowRun
>             <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/> ;
>
>         wfprov:usedInput
>             <http://ns.taverna.org.uk/2011/data/f975fc2f-cb00-4421-874f-b123f8d11998/ref/52530175-ee7c-467a-aa7e-137289540b6a> ;
>         prov:used
>             <http://ns.taverna.org.uk/2011/data/f975fc2f-cb00-4421-874f-b123f8d11998/ref/52530175-ee7c-467a-aa7e-137289540b6a> ;
>
>         # Link to the plan (definition) for this particular process
>         # Note: We distinguish between the activity of a particular
>         # execution of the process, and the process/service definition,
>         # which would remain the same across multiple executions of the
>         # same workflow definition.
>         wfprov:describedByProcess
>             <http://ns.taverna.org.uk/2010/workflowBundle/9f925cdf-98b9-4034-8d25-6b44b15a4635/workflow/Workflow21/processor/Beanshell/> ;
>         prov:wasAssociatedWith :taverna-engine ;
>         prov:qualifiedAssociation [
>             a prov:Association ;
>             # We consider each step of a workflow to be run by the same
>             # engine, not by the parent activity.
>             prov:agent :taverna-engine ;
>             prov:hadPlan
>                 <http://ns.taverna.org.uk/2010/workflowBundle/9f925cdf-98b9-4034-8d25-6b44b15a4635/workflow/Workflow21/processor/Beanshell/>
>         ] .
>
>
> This allows workflow-level inputs and outputs to be generated/used
> by both the upper workflow run and the individual processes.
>
> <http://ns.taverna.org.uk/2011/data/f975fc2f-cb00-4421-874f-b123f8d11998/ref/a2282da2-fc5a-47e5-ad0e-106adaec7715>
>         a prov:Entity ;
>         tavernaprov:content <Beanshell_startTimeRange.txt> ;
>         wfprov:wasOutputFrom
>             <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/> ,
>             <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/process/d0a6455a-7c47-42f8-9bc7-8515bcd445aa/> ;
>         prov:wasGeneratedBy
>             <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/> ,
>             <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/process/d0a6455a-7c47-42f8-9bc7-8515bcd445aa/> .
>
>
>
> >
> > A quick example:
> >
> > # bpl namespace is made up; I'm not a workflow guy.
> > :workflow_plan_1 a bpl:Workflow .
> >
> > :workflow_engine_5 a bpl:WorkflowEngine .
> >
> > :workflow_execution_17
> >   a prov:Activity;
> >   prov:wasAssociatedWith :workflow_engine_5;
> >   prov:qualifiedAssociation [ # Don't use bnodes in practice.
> >       a prov:Association;
> >       prov:hadPlan :workflow_plan_1;
> >       prov:agent :workflow_engine_5;
> >  ];
> > .
> >
> > # Note that the PROV recommendation says nothing about _how_ one
> > # associates provenance assertions to a bundle. Named graphs are one way.
> > :my_bundle {
> >     :my_bundle
> >            a prov:Bundle;
> >            # This is the link from a Bundle to a Plan (via an Activity's
> >            # Association).
> >            prov:wasGeneratedBy :workflow_execution_17;
> >     .
> >     :cake a prov:Entity;
> >         # Whatever your execution engine wanted to say….
> >         prov:wasAttributedTo :jacco .
> > }
> >
> >
> > Regards,
> > Tim
> >
> >
> >
> >> Thanks,
> >>
> >> Jacco
> >>
> >>
> >>
> >>
> >
> >
>
>
>
> --
> Stian Soiland-Reyes, myGrid team
> School of Computer Science
> The University of Manchester
>
>

Received on Wednesday, 30 January 2013 10:24:25 UTC