Re: Bundles explained from Stian Soiland-Reyes on 2013-10-28 (public-prov-comments@w3.org from October 2013)

From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
Date: Mon, 28 Oct 2013 13:54:05 +0000
To: Mike Loll <mike.loll@gmail.com>
Cc: "public-prov-comments@w3.org" <public-prov-comments@w3.org>
Message-ID: <CAPRnXtkShAz9pV2sAbamQc8-BWjR0DnMUPQ_NQvT2qjJDMbFTw@mail.gmail.com>

Hi!

I would say that any resource that contains provenance statements (in
particular PROV statements) is a prov:Bundle. However, that fact might
not be recorded anywhere, and the term would generally only be used
when you want to describe the provenance of provenance records, or if
you are cataloguing provenance traces.

In my application I report the provenance of a scientific workflow run.

When I save this provenance to a file, it includes its own
meta-provenance so you can tell how and when this file was recorded,
as it could have been saved from the internal database at an arbitrary
time after the run.

In RDF this is normally quite easy: you simply describe the relative
URI <>, which means "this document" - wherever it ends up being
located:

<> a prov:Bundle ;
        foaf:primaryTopic            <http://ns.taverna.org.uk/2011/run/5e93cdba-27ec-4757-addf-fc91be12c7a4/> ;
        prov:wasGeneratedBy          <#taverna-prov-export> .

<#taverna-prov-export> a prov:Activity ;
        rdfs:label                   "taverna-prov export of workflow run provenance"@en ;
        prov:startedAtTime           "2013-09-02T15:22:25.961Z"^^xsd:dateTime ;
        prov:endedAtTime             "2013-09-02T15:22:30.89Z"^^xsd:dateTime ;
        prov:wasInformedBy           <http://ns.taverna.org.uk/2011/run/5e93cdba-27ec-4757-addf-fc91be12c7a4/> ;
        prov:wasAssociatedWith       <#taverna-engine> ;
        prov:qualifiedAssociation    [ a prov:Association ;
            prov:hadPlan  <http://ns.taverna.org.uk/2011/software/taverna-2.4.0> ;
            prov:agent    <#taverna-engine>
        ] .

<http://ns.taverna.org.uk/2011/run/5e93cdba-27ec-4757-addf-fc91be12c7a4/> a prov:Activity, wfprov:WorkflowRun ;
        rdfs:label                   "Workflow run of GWAS_to_biomedical_c"@en ;
        prov:startedAtTime           "2013-09-02T17:19:31.676+02:00"^^xsd:dateTime ;
        prov:endedAtTime             "2013-09-02T17:20:00.662+02:00"^^xsd:dateTime .

# .. followed by the actual workflow run provenance with many more
# activities and nested wfprov:WorkflowRuns
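
As a rough illustration of why the relative URI <> works "wherever
the file ends up", here is a minimal Python sketch that resolves the
relative references from the Turtle above against a base URI. The two
example.org/example.com locations are made up for the illustration;
the sketch only uses the standard library's urljoin and is not what
taverna-prov itself does.

```python
from urllib.parse import urljoin

# Hypothetical places where the same exported file might be published.
locations = [
    "http://example.org/provenance/run1.ttl",
    "http://example.com/archive/2013/run1.prov.ttl",
]


def resolve(base, relative):
    """Resolve a relative URI reference against the document's base URI."""
    return urljoin(base, relative)


for base in locations:
    # <> (the empty reference) resolves to the document itself ...
    bundle = resolve(base, "")
    # ... and <#taverna-prov-export> to a fragment within that document.
    export_activity = resolve(base, "#taverna-prov-export")
    print(bundle, export_activity)
```

So the bundle and its export activity get identifiers that follow the
document around, without anything in the file having to change.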

I am not saying that everyone should include this meta-provenance, as
in many cases it would be self-evident or not relevant - but in my
case it is important for three reasons:

1 - I can see the version of the software used to generate the
provenance (as I am still developing that)
2 - I can see when provenance was exported compared to when it was
run. In this case just 2 minutes after - and hence I can be fairly
certain about statements the provenance trace makes about generated
files etc.
3 - I use foaf:primaryTopic (my own convention - which makes <> also
be a foaf:Document) to find the top-level "starting point" of the
provenance. (this is also indicated with the slightly weaker relation
prov:wasInformedBy)
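
To make point 2 concrete, here is a small Python sketch that compares
the run's prov:endedAtTime with the export's prov:startedAtTime, using
the two timestamps from the example above. The helper name
parse_xsd_datetime is just something I made up for this sketch, and it
only handles the timestamp shapes shown here, not xsd:dateTime in
general.

```python
from datetime import datetime


def parse_xsd_datetime(value):
    """Parse the timestamps used above into timezone-aware datetimes.

    Hypothetical helper: older Python versions do not accept a trailing
    "Z" in fromisoformat(), so it is rewritten as an explicit UTC offset.
    """
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)


# prov:endedAtTime of the workflow run (local time, +02:00)
run_ended = parse_xsd_datetime("2013-09-02T17:20:00.662+02:00")
# prov:startedAtTime of the taverna-prov export (UTC)
export_started = parse_xsd_datetime("2013-09-02T15:22:25.961Z")

# Both values carry an offset, so the subtraction accounts for time zones.
delay = export_started - run_ended
print("Export started", delay.total_seconds(), "seconds after the run ended")
```

For these values the delay comes out at roughly 145 seconds, which is
the "just 2 minutes after" mentioned above.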

On 28 October 2013 11:26, Mike Loll <mike.loll@gmail.com> wrote:
> I'm having some difficulty wrapping my head around when bundles would be
> used.  Is it so we can describe how a set of provenance records came to be
> (the provenance of the provenance)?
>
> I'm having a little difficulty wrapping my head around the use cases.
>
> Example 40 from http://www.w3.org/TR/2013/REC-prov-dm-20130430/#component4
> shows two reports (r1, r2) being generated with r2 derived from r1.  It then
> describes a bundle describing that "Bob" witnessed r1 being generated.  The
> example goes on to show a bundle for "Alice" observing the generation of r2.
>
> How is this useful?  I think my real question is shouldn't all provenance
> events be contained in a bundle?
>
> Any insight is appreciated.
>
> I'm working on a clojure implementation of the provenance model as an
> exercise and I want to be sure I have my understanding set.
>
> Thanks.
>
>
> --
> Mike Loll

-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
http://soiland-reyes.com/stian/work/ http://orcid.org/0000-0001-9842-9718

Received on Monday, 28 October 2013 13:54:53 UTC