Re: Bundles explained from Mike Loll on 2013-10-29 (public-prov-comments@w3.org from October 2013)

From: Mike Loll <mike.loll@gmail.com>
Date: Tue, 29 Oct 2013 08:15:46 -0400
To: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
Cc: "public-prov-comments@w3.org" <public-prov-comments@w3.org>
Message-ID: <CAHHe3w3gLSXjYOm8ju_NUSVaYersHH=j09yvFWyb5nE_VKtQwA@mail.gmail.com>
Thank you very much.  The answers I've received are great.  I hope I can
contribute back somehow!

--
Mike Loll


On Tue, Oct 29, 2013 at 8:10 AM, Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk> wrote:

> This is a very good question. I am not sure how this relates to
> bundles, except that perhaps you want to describe the longer-living
> entities in a different bundle (e.g. "the alarm database") from the
> more short-lived entities (e.g. "alarm events this week").
>
>
> In PROV we describe entities as in one way or another being 'static'.
> In your case, there are two abstraction levels of how 'static' an
> alarm is.
>
> <alarm/1> a prov:Entity, ex:Alarm ;
>    prov:atLocation <customer/5> .
>
> We here consider the alarm over its lifetime at a given customer, no
> matter its current state. So we can describe its installation date as
> its provenance:
>
> <alarm/1> prov:generatedAtTime "1984-05-15T17:19:41.146" .
>
> We can also of course list properties that are more fluctuating and
> might change during its lifetime:
>
> <alarm/1> ex:currentStatus "active" ;
>       ex:brightness 0.80 ;
>       ex:noiseLevel 0.50 .
>
> If I retrieve the same resource later today, this might instead be:
>
> <alarm/1> ex:currentStatus "disabled" ;
>       ex:brightness 0.20 ;
>       ex:noiseLevel 0.89 .
>
> Now what if we wanted to know how it changed from active to disabled,
> but don't really care about all the possible levels of brightness and
> noise it had in between? Then it might make sense to specialize the
> alarm entity to what we would in common language probably just call
> "alarm state". It is still describing the alarm, but at a finer
> granularity:
>
> <alarm/1/state/123> a prov:Entity, ex:AlarmState ;
>   prov:specializationOf <alarm/1> ;
>   ex:currentStatus "active" ;
>   prov:generatedAtTime "2013-10-28T18:00:00Z" ;
>   prov:invalidatedAtTime "2013-10-28T23:50:00Z" .
>
> <alarm/1/state/124> a prov:Entity, ex:AlarmState ;
>   prov:specializationOf <alarm/1> ;
>   ex:currentStatus "disabled" ;
>   prov:generatedAtTime "2013-10-28T23:50:00Z" .
>
> We might specify a new subclass "ex:AlarmState" that we know 'locks
> down' the state - this would allow different kinds of specialization,
> in case you also needed a specialization like ex:BrightnessLog.
>
> Each state has a different generation and invalidation time,
> indicating the life span of the state. This is a continuous span, so
> the alarm state that was disabled last week is different from the
> disabled alarm state today, because the alarm was active in the
> meantime.
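
A minimal Python sketch of the life-span idea above, assuming each alarm state is held as a small record carrying the generation and invalidation times from the example. The `AlarmState` class and the `state_at` helper are hypothetical names for illustration, not part of PROV or of any library. It looks up which state was valid at a given instant:

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class AlarmState:
    """One specialization of the alarm, valid over a continuous time span."""
    uri: str
    status: str
    generated: datetime                      # prov:generatedAtTime
    invalidated: Optional[datetime] = None   # prov:invalidatedAtTime, None while still valid


def state_at(states, when):
    """Return the state whose [generated, invalidated) span contains `when`, or None."""
    for state in states:
        ended = state.invalidated is not None and when >= state.invalidated
        if state.generated <= when and not ended:
            return state
    return None


# The two states from the example above, with times in UTC.
states = [
    AlarmState("alarm/1/state/123", "active",
               datetime(2013, 10, 28, 18, 0, tzinfo=timezone.utc),
               datetime(2013, 10, 28, 23, 50, tzinfo=timezone.utc)),
    AlarmState("alarm/1/state/124", "disabled",
               datetime(2013, 10, 28, 23, 50, tzinfo=timezone.utc)),
]

evening = datetime(2013, 10, 28, 20, 0, tzinfo=timezone.utc)
print(state_at(states, evening).status)  # active
```

Treating the span as half-open means the instant 23:50 belongs to the new "disabled" state only, so two states never overlap.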
>
>
> You might want to organize these states in an order so you don't need
> to compare the start/end timestamps.
>
> <alarm/1/state/124> prov:wasRevisionOf <alarm/1/state/123> .
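
The ordering can then be recovered by walking the prov:wasRevisionOf links rather than sorting timestamps. A sketch in Python, assuming the links are available as a plain dict from each state to the state it revises (the `revision_of` mapping and the `history` function are illustrative names only):

```python
def history(revision_of, latest):
    """Walk prov:wasRevisionOf links from `latest` back to the first state.

    `revision_of` maps a state URI to the URI of the state it revises.
    Returns the states oldest first.
    """
    chain = [latest]
    seen = {latest}
    while chain[-1] in revision_of:
        previous = revision_of[chain[-1]]
        if previous in seen:          # guard against an accidental cycle
            raise ValueError("cycle in wasRevisionOf links at " + previous)
        chain.append(previous)
        seen.add(previous)
    chain.reverse()
    return chain


# <alarm/1/state/124> prov:wasRevisionOf <alarm/1/state/123> .
revision_of = {"alarm/1/state/124": "alarm/1/state/123"}

print(history(revision_of, "alarm/1/state/124"))
# ['alarm/1/state/123', 'alarm/1/state/124']
```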
>
>
> So what if we want to describe who disabled it? A simple solution is
> to now just provide prov:wasAttributedTo at each state:
>
> <alarm/1/state/124> prov:wasAttributedTo <customer/5>.
>
>
> So now we know that customer/5 caused the alarm to be disabled somehow
> (it was not a supervisor at the security company).
>
> If you want to detail this more, say to record how the customer did
> this (e.g. clicking the alarm panel) - then you can introduce an
> activity to describe the transition:
>
> <alarm/1/state/123> prov:wasInvalidatedBy <activities/987> .
> <alarm/1/state/124> prov:wasGeneratedBy <activities/987> .
>
> <activities/987> a prov:Activity, ex:AlarmPanelAction ;
>     prov:wasAssociatedWith <customer/5> .
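
To answer "who disabled it?" from statements like these, one can follow wasGeneratedBy from the state to the activity and then wasAssociatedWith to the agent. A rough Python sketch over plain (subject, predicate, object) tuples; the `objects` and `responsible_agents` helpers are made-up names, and a real application would more likely use an RDF library and a SPARQL query:

```python
TRIPLES = {
    ("alarm/1/state/123", "prov:wasInvalidatedBy", "activities/987"),
    ("alarm/1/state/124", "prov:wasGeneratedBy", "activities/987"),
    ("activities/987", "rdf:type", "ex:AlarmPanelAction"),
    ("activities/987", "prov:wasAssociatedWith", "customer/5"),
}


def objects(triples, subject, predicate):
    """All objects o such that (subject, predicate, o) is in `triples`."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def responsible_agents(triples, state):
    """Agents associated with any activity that generated `state`."""
    agents = set()
    for activity in objects(triples, state, "prov:wasGeneratedBy"):
        agents |= objects(triples, activity, "prov:wasAssociatedWith")
    return agents


print(responsible_agents(TRIPLES, "alarm/1/state/124"))  # {'customer/5'}
```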
>
>
> Now, to get back to the bundles: if you have separate bundles of alarm
> activities, for instance one per week, then you could refer back to
> the original bundle in your specialization, as described in
> http://www.w3.org/TR/prov-links/ :
>
> <alarm/1/state/124> prov:mentionOf <alarm/1> ;
>   prov:asInBundle <http://example.com/alarms> .
>
>
> In a way this is just a more formal way of saying:
>
> <alarm/1/state/124> prov:specializationOf <alarm/1> .
> <alarm/1> prov:has_provenance <http://example.com/alarms> .
>
> (using has_provenance from http://www.w3.org/TR/prov-aq/ ), except
> that mentionOf/asInBundle adds the additional promise that you will
> find <alarm/1> described as an entity inside that bundle.
>
>
>
> On 29 October 2013 11:26, Mike Loll <mike.loll@gmail.com> wrote:
> > Thanks, Stian.
> >
> > My understanding is that an entity referenced in a bundle (e.g. via
> > wasGeneratedBy) must be in the bundle...but I do not wish to duplicate
> > entity definitions throughout my bundles.  My entities are long lived
> > and will exist in multiple bundles.
> >
> > So let's say I have a resource for alarms which contains a list of all
> > alarms my company monitors.  If I turn off the alarm at /alarm/1, my
> > understanding is that in prov a new entity is created for the new state
> > of /alarm/1.  But in my actual data store, I don't create a new record,
> > I just toggle a flag.
> >
> > So there is a disconnect between how my prov looks and how my data looks.
> > This is by design, as I understand it.  So I would have a new entity in
> > my prov for the /alarm/1 in the new state which is a specialization of
> > /alarm/1, yes?
> >
> > Ultimately, I want to display all of the provenance for /alarm/1 so I can
> > see its history from creation to invalidation.  Am I going about this the
> > wrong way?
> >
> >
> > --
> > Mike Loll
> >
> >
> > On Mon, Oct 28, 2013 at 9:54 AM, Stian Soiland-Reyes
> > <soiland-reyes@cs.manchester.ac.uk> wrote:
> >>
> >> Hi!
> >>
> >> I would say that any resource that contains provenance statements (in
> >> particular PROV statements) is a prov:Bundle. However that fact might
> >> not be recorded anywhere, and it would generally only be used as a
> >> term when you want to describe provenance of provenance records, or if
> >> you are cataloguing provenance traces.
> >>
> >>
> >> In my application I report the provenance of a scientific workflow run.
> >>
> >> When I save this provenance to a file, it includes its own
> >> meta-provenance so you can tell how and when this file was recorded,
> >> as it could have been saved from the internal database at an arbitrary
> >> time after the run.
> >>
> >> In RDF this is normally quite easy by simply describing the relative
> >> URI <> which would mean "this document" - wherever it ends up being
> >> located:
> >>
> >>
> >> <> a prov:Bundle ;
> >>     foaf:primaryTopic <http://ns.taverna.org.uk/2011/run/5e93cdba-27ec-4757-addf-fc91be12c7a4/> ;
> >>     prov:wasGeneratedBy <#taverna-prov-export> .
> >>
> >> <#taverna-prov-export> a prov:Activity ;
> >>     rdfs:label "taverna-prov export of workflow run provenance"@en ;
> >>     prov:startedAtTime "2013-09-02T15:22:25.961Z"^^xsd:dateTime ;
> >>     prov:endedAtTime "2013-09-02T15:22:30.89Z"^^xsd:dateTime ;
> >>     prov:wasInformedBy <http://ns.taverna.org.uk/2011/run/5e93cdba-27ec-4757-addf-fc91be12c7a4/> ;
> >>     prov:wasAssociatedWith <#taverna-engine> ;
> >>     prov:qualifiedAssociation [ a prov:Association ;
> >>         prov:hadPlan <http://ns.taverna.org.uk/2011/software/taverna-2.4.0> ;
> >>         prov:agent <#taverna-engine>
> >>     ] .
> >>
> >> <http://ns.taverna.org.uk/2011/run/5e93cdba-27ec-4757-addf-fc91be12c7a4/> a prov:Activity, wfprov:WorkflowRun ;
> >>     rdfs:label "Workflow run of GWAS_to_biomedical_c"@en ;
> >>     prov:startedAtTime "2013-09-02T17:19:31.676+02:00"^^xsd:dateTime ;
> >>     prov:endedAtTime "2013-09-02T17:20:00.662+02:00"^^xsd:dateTime .
> >>
> >> # .. followed by the actual workflow run provenance with many more
> >> # activities and nested wfprov:WorkflowRuns
> >>
> >>
> >> I am not saying that everyone should include this meta-provenance as
> >> in many cases it would be self-evident or not relevant - but in my
> >> case it is important for three reasons.
> >>
> >> 1 - I can see the version of the software used to generate the
> >> provenance (as I am still developing that)
> >> 2 - I can see when provenance was exported compared to when it was
> >> run. In this case just 2 minutes after - and hence I can be fairly
> >> certain about statements the provenance trace makes about generated
> >> files etc.
> >> 3 - I use foaf:primaryTopic (my own convention - which makes <> also
> >> be a foaf:Document) to find the top-level "starting point" of the
> >> provenance. (this is also indicated with the slightly weaker relation
> >> prov:wasInformedBy)
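
As a quick check of point 2, the gap between the end of the workflow run and the start of the export can be computed from the two timestamps in the example. A small Python sketch; the timestamps are written out by hand with explicit offsets rather than parsed from the file, and the variable names are only illustrative:

```python
from datetime import datetime, timedelta, timezone

# prov:endedAtTime of the workflow run, given with a +02:00 offset
run_ended = datetime(2013, 9, 2, 17, 20, 0, 662000,
                     tzinfo=timezone(timedelta(hours=2)))
# prov:startedAtTime of the taverna-prov export, given in UTC
export_started = datetime(2013, 9, 2, 15, 22, 25, 961000, tzinfo=timezone.utc)

# Aware datetimes are compared on the UTC timeline, so the differing
# offsets are handled by the subtraction itself.
delay = export_started - run_ended
print(delay)  # 0:02:25.299000
```

So the export started roughly two and a half minutes after the run finished, which matches the "just 2 minutes after" reading above.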
> >>
> >>
> >>
> >> On 28 October 2013 11:26, Mike Loll <mike.loll@gmail.com> wrote:
> >> > I'm having some difficulty wrapping my head around when bundles would
> >> > be used.  Is it so we can describe how a set of provenance records
> >> > came to be (the provenance of the provenance)?
> >> >
> >> > I'm having a little difficulty wrapping my head around the use cases.
> >> >
> >> > Example 40 from
> >> > http://www.w3.org/TR/2013/REC-prov-dm-20130430/#component4
> >> > shows two reports (r1, r2) being generated with r2 derived from r1.
> >> > It then describes a bundle describing that "Bob" witnessed r1 being
> >> > generated.  The example goes on to show a bundle for "Alice" observing
> >> > the generation of r2.
> >> >
> >> > How is this useful?  I think my real question is shouldn't all
> >> > provenance events be contained in a bundle?
> >> >
> >> > Any insight is appreciated.
> >> >
> >> > I'm working on a clojure implementation of the provenance model as an
> >> > exercise and I want to be sure I have my understanding set.
> >> >
> >> > Thanks.
> >> >
> >> >
> >> > --
> >> > Mike Loll
> >>
> >>
> >>
> >> --
> >> Stian Soiland-Reyes, myGrid team
> >> School of Computer Science
> >> The University of Manchester
> >> http://soiland-reyes.com/stian/work/ http://orcid.org/0000-0001-9842-9718
> >
> >
>
>
>
> --
> Stian Soiland-Reyes, myGrid team
> School of Computer Science
> The University of Manchester
> http://soiland-reyes.com/stian/work/ http://orcid.org/0000-0001-9842-9718
>
Received on Tuesday, 29 October 2013 12:16:19 UTC