- From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Date: Wed, 30 Jan 2013 09:59:51 +0000
- To: Jacco van Ossenbruggen <Jacco.van.Ossenbruggen@cwi.nl>
- Cc: public-prov-comments@w3.org
On Tue, Jan 29, 2013 at 4:37 PM, Timothy Lebo <lebot@rpi.edu> wrote:
> The key is that the Bundle was generated by the Activity of executing the workflow, and the Activity's Association had a Plan that was followed to perform the Activity.

This kind of "upper level activity" is very similar to how we include this kind of 'meta provenance' in the provenance traces of the workflow engine Taverna.

Just a slight change: we distinguish between the activity of exporting provenance and the activity of executing the workflow, as the export can be done at an arbitrary point in the future, while the provenance data is kept in an internal database. Another reason is that in Taverna, the workflow definition does not say anything about making provenance traces or where outputs should be saved.

The following is not literally what comes out of the workflow engine, as I still have to massage Sesame's AliBaba into being a bit more verbose about class membership and superproperties. (It does not bother declaring class membership when it is already given by the domain/range of a property.)

Note that we use two extensions of PROV:

* <http://purl.org/wf4ever/wfprov#> (an attempt to form a general dataflow provenance model, see <http://purl.org/wf4ever/model> for a pretty picture)
* <http://ns.taverna.org.uk/2012/tavernaprov/> (which is Taverna-specific)

The first might be relevant for you if your workflow system is dataflow oriented, but it would need further work if you need to cover BPEL-like or Kepler-like workflows.

Example, based on https://github.com/wf4ever/provenance-corpus/blob/master/Taverna_repository/workflow_3152_version_1/run_1/workflowrun.prov.ttl

### Self-documenting metadata - we don't bother with named graphs for this purpose
<> a prov:Bundle ;
    prov:wasGeneratedBy :taverna-prov-export .

# This is the activity that has generated all the files in
# https://github.com/wf4ever/provenance-corpus/tree/master/Taverna_repository/workflow_3152_version_1/run_1
:taverna-prov-export a prov:Activity ;
    rdfs:label "taverna-prov export of workflow run provenance" ;
    prov:wasAssociatedWith :taverna-engine ;
    prov:qualifiedAssociation [
        a prov:Association ;
        prov:agent :taverna-engine ;
        # For the export, the plan is the workflow engine software
        prov:hadPlan <http://ns.taverna.org.uk/2011/software/taverna-2.4.0>
    ] ;
    prov:startedAtTime "2012-10-05T14:14:08.171+02:00"^^xsd:dateTime ;
    prov:endedAtTime "2012-10-05T14:14:37.750+02:00"^^xsd:dateTime ;
    # The link to the workflow run activity which provided the provenance data
    prov:wasInformedBy <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/> .
#### End of metadata

:taverna-engine a prov:SoftwareAgent, wfprov:WorkflowEngine, foaf:Agent .
# We don't say much about the engine instance at all, as we found it difficult
# to identify (our engines run on desktop computers, servers and cloud
# instances, and hence do not easily have a resolvable URI) and also difficult
# to scope. Is it one execution of a particular workflow (which technically is
# how the "engine" is instantiated in Taverna), an execution of the software
# code (an operating system process), one installation on a particular machine,
# that particular software/plugin combination on any machine, or any version
# of the Taverna software?

# The upper workflow run, corresponding to the master workflow
# We have discussed making an even higher-level activity, which would cover
# "Starting the workflow run". It would not have a plan, would be associated
# with the user who clicked the Run button, and would cover further actions
# beyond the workflow definition, such as saving of outputs.
<http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/> a wfprov:WorkflowRun, prov:Activity ;
    rdfs:label "Workflow run of Workflow21" ;
    prov:startedAtTime "2012-10-05T14:13:14.921+02:00"^^xsd:dateTime ;
    prov:endedAtTime "2012-10-05T14:13:17.281+02:00"^^xsd:dateTime ;
    # The wfprov shortcut for the plan given in the association below
    wfprov:describedByWorkflow <http://ns.taverna.org.uk/2010/workflowBundle/9f925cdf-98b9-4034-8d25-6b44b15a4635/workflow/Workflow21/> ;
    wfprov:wasEnactedBy :taverna-engine ;
    prov:wasAssociatedWith :taverna-engine ;
    prov:qualifiedAssociation [
        a prov:Association ;
        prov:agent :taverna-engine ;
        # This is the identifier for the workflow definition.
        # Note: A workflowBundle is not a prov:Bundle, btw, it's just a zip of RDF files
        prov:hadPlan <http://ns.taverna.org.uk/2010/workflowBundle/9f925cdf-98b9-4034-8d25-6b44b15a4635/workflow/Workflow21/>
    ] ;
    # The PROV WG recommendation for associating this activity with 'sub activities'
    dcterms:hasPart <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/process/1c4240de-4217-4fb7-b2f5-11626f584071/> ,
        <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/process/d0a6455a-7c47-42f8-9bc7-8515bcd445aa/> .

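As a rough illustration only (this is not something Taverna or taverna-prov produces), the chain from the bundle to the workflow definition in the statements above can be followed hop by hop: bundle, export activity, informing workflow run, association, plan. The sketch below assumes the relevant triples have been copied into a plain Python list; the shortened names such as `run/f975fc2f`, `_:runAssoc` and `workflow/Workflow21` are placeholders for the full URIs and blank nodes, and the helper functions are hypothetical.

```python
# Toy triple list: (subject, predicate, object) tuples.
# All identifiers are shortened placeholders for the URIs and blank nodes
# in the example above; they are assumptions made for illustration.
TRIPLES = [
    ("<>", "prov:wasGeneratedBy", ":taverna-prov-export"),
    (":taverna-prov-export", "prov:qualifiedAssociation", "_:exportAssoc"),
    ("_:exportAssoc", "prov:hadPlan", "software/taverna-2.4.0"),
    (":taverna-prov-export", "prov:wasInformedBy", "run/f975fc2f"),
    ("run/f975fc2f", "prov:qualifiedAssociation", "_:runAssoc"),
    ("_:runAssoc", "prov:hadPlan", "workflow/Workflow21"),
]


def objects(subject, predicate, triples=TRIPLES):
    """Return every object o for which (subject, predicate, o) is asserted."""
    return [o for s, p, o in triples if s == subject and p == predicate]


def plans_of(activity, triples=TRIPLES):
    """Plans reached via activity -> qualifiedAssociation -> hadPlan."""
    return [
        plan
        for assoc in objects(activity, "prov:qualifiedAssociation", triples)
        for plan in objects(assoc, "prov:hadPlan", triples)
    ]


def workflow_plans_of_bundle(bundle, triples=TRIPLES):
    """Follow bundle -> export activity -> informing run -> plan."""
    return [
        plan
        for export in objects(bundle, "prov:wasGeneratedBy", triples)
        for run in objects(export, "prov:wasInformedBy", triples)
        for plan in plans_of(run, triples)
    ]


if __name__ == "__main__":
    # The export activity's own plan is the engine software ...
    print(plans_of(":taverna-prov-export"))
    # ... while the workflow definition sits one wasInformedBy hop further.
    print(workflow_plans_of_bundle("<>"))
```

The extra `prov:wasInformedBy` hop is the only difference from the direct bundle-to-plan path in the quoted example; with a real RDF library the same walk would be a property path or a short SPARQL query.
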
# An execution of a particular step/process in the workflow
<http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/process/d0a6455a-7c47-42f8-9bc7-8515bcd445aa/> a wfprov:ProcessRun, prov:Activity ;
    rdfs:label "Processor execution Beanshell (facade0:Workflow21:Beanshell)" ;
    prov:startedAtTime "2012-10-05T14:13:17.171+02:00"^^xsd:dateTime ;
    prov:endedAtTime "2012-10-05T14:13:17.250+02:00"^^xsd:dateTime ;
    # A stronger link to the master run than dcterms:hasPart above
    wfprov:wasPartOfWorkflowRun <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/> ;
    wfprov:usedInput <http://ns.taverna.org.uk/2011/data/f975fc2f-cb00-4421-874f-b123f8d11998/ref/52530175-ee7c-467a-aa7e-137289540b6a> ;
    prov:used <http://ns.taverna.org.uk/2011/data/f975fc2f-cb00-4421-874f-b123f8d11998/ref/52530175-ee7c-467a-aa7e-137289540b6a> ;
    # Link to the plan (definition) for this particular process
    # Note: We distinguish between the activity of a particular execution of the process,
    # and the process/service definition, which would remain the same across
    # multiple executions of the same workflow definition.
    wfprov:describedByProcess <http://ns.taverna.org.uk/2010/workflowBundle/9f925cdf-98b9-4034-8d25-6b44b15a4635/workflow/Workflow21/processor/Beanshell/> ;
    prov:wasAssociatedWith :taverna-engine ;
    prov:qualifiedAssociation [
        a prov:Association ;
        # We consider each step of a workflow to be run by the same engine,
        # not by the parent activity.
        prov:agent :taverna-engine ;
        prov:hadPlan <http://ns.taverna.org.uk/2010/workflowBundle/9f925cdf-98b9-4034-8d25-6b44b15a4635/workflow/Workflow21/processor/Beanshell/>
    ] .

This allows workflow-level inputs and outputs to be generated/used by both the upper workflow run and the individual processes:

<http://ns.taverna.org.uk/2011/data/f975fc2f-cb00-4421-874f-b123f8d11998/ref/a2282da2-fc5a-47e5-ad0e-106adaec7715> a prov:Entity ;
    tavernaprov:content <Beanshell_startTimeRange.txt> ;
    wfprov:wasOutputFrom <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/> ,
        <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/process/d0a6455a-7c47-42f8-9bc7-8515bcd445aa/> ;
    prov:wasGeneratedBy <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/> ,
        <http://ns.taverna.org.uk/2011/run/f975fc2f-cb00-4421-874f-b123f8d11998/process/d0a6455a-7c47-42f8-9bc7-8515bcd445aa/> .

>
> A quick example:
>
> :workflow_plan_1 a bpl:Workflow . # bpl namespace is made up; I'm not a workflow guy.
>
> :workflow_engine_5 a bpl:WorkflowEngine .
>
> :workflow_execution_17
>    a prov:Activity;
>    prov:wasAssociatedWith :workflow_engine_5;
>    prov:qualifiedAssociation [ # Don't use bnodes in practice.
>       a prov:Association;
>       prov:hadPlan :workflow_plan_1;
>       prov:agent :workflow_engine_5;
>    ];
> .
>
> :my_bundle { # Note that the PROV recommendation say nothing about _how_ one associates provenance assertions to a bundle. Named graphs is one way.
>    :my_bundle
>       a prov:Bundle;
>       prov:wasGeneratedBy :workflow_execution_17; # This is the link from a Bundle to a Plan (via an Activity's Association).
>    .
>    :cake a prov:Entity;
>       prov:wasAttributedTo :jacco . # Whatever your execution engine wanted to say….
> }
>
> Regards,
> Tim
>
>> Thanks,
>>
>> Jacco

-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
Received on Wednesday, 30 January 2013 10:00:42 UTC