- From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Date: Tue, 29 Oct 2013 15:22:19 +0000
- To: Mike Loll <mike.loll@gmail.com>, "public-prov-comments@w3.org" <public-prov-comments@w3.org>
See http://practicalprovenance.wordpress.com/2013/10/29/resources-that-change-state/

On 29 October 2013 12:19, Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk> wrote:
> Cheers. A Clojure implementation sounds very promising and something I
> would personally be interested in!
>
> You might want to also look at the PROV Toolbox for Java - so you
> don't need to focus on the different serializations - see
> https://github.com/lucmoreau/ProvToolbox/
>
> Can I refer to your questions and your name in a blog post? I am
> thinking of writing this up for my blog at
> https://practicalprovenance.wordpress.com
>
> On 29 October 2013 12:15, Mike Loll <mike.loll@gmail.com> wrote:
>> Thank you very much. The answers I've received are great. I hope I can
>> contribute back somehow!
>>
>> --
>> Mike Loll
>>
>> On Tue, Oct 29, 2013 at 8:10 AM, Stian Soiland-Reyes
>> <soiland-reyes@cs.manchester.ac.uk> wrote:
>>>
>>> This is a very good question. I am not sure how this relates to
>>> bundles - except that perhaps you want to describe the longer-living
>>> entities in a different bundle (e.g. "the alarm database") from the
>>> more short-lived entities (e.g. "alarm events this week").
>>>
>>> In PROV we describe entities as in one way or another being 'static'.
>>> In your case, there are two abstraction levels of how 'static' an
>>> alarm is.
>>>
>>> <alarm/1> a prov:Entity, ex:Alarm ;
>>>     prov:atLocation <customer/5> .
>>>
>>> Here we consider the alarm over its lifetime at a given customer, no
>>> matter its current state. So we can describe its installation date as
>>> its provenance:
>>>
>>> <alarm/1> prov:generatedAtTime "1984-05-15-02T17:19:41.146" .
>>>
>>> We can also of course list properties that are more fluctuating and
>>> might change during its lifetime:
>>>
>>> <alarm/1> ex:currentStatus "active" ;
>>>     ex:brightness 0.80 ;
>>>     ex:noiseLevel 0.50 .
>>>
>>> If I retrieve the same resource later today, this might instead be:
>>>
>>> <alarm/1> ex:currentStatus "disabled" ;
>>>     ex:brightness 0.20 ;
>>>     ex:noiseLevel 0.89 .
>>>
>>> Now what if we wanted to know how it changed from active to disabled,
>>> but don't really care about all the possible levels of brightness and
>>> noise it had in between? Then it might make sense to specialize the
>>> alarm entity to what we would in common language probably just call
>>> "alarm state". It is still describing the alarm, but at a finer
>>> granularity:
>>>
>>> <alarm/1/state/123> a prov:Entity, ex:AlarmState ;
>>>     prov:specializationOf <alarm/1> ;
>>>     ex:currentStatus "active" ;
>>>     prov:generatedAtTime "2013-10-28T18:00:00Z" ;
>>>     prov:invalidatedAtTime "2013-10-28T23:50:00Z" .
>>>
>>> <alarm/1/state/124> a prov:Entity, ex:AlarmState ;
>>>     prov:specializationOf <alarm/1> ;
>>>     ex:currentStatus "disabled" ;
>>>     prov:generatedAtTime "2013-10-28T23:50:00Z" .
>>>
>>> We might specify a new subclass "ex:AlarmState" that we know 'locks
>>> down' the state - this would allow different kinds of specialization,
>>> in case you also needed a specialization like ex:BrightnessLog.
>>>
>>> Each state has a different generation and invalidation time,
>>> indicating the life span of the state. This is a continuous span, so
>>> the alarm state that was disabled last week is different from the
>>> disabled alarm state today, because the alarm was active in the
>>> meantime.
>>>
>>> You might want to organize these states in an order so you don't need
>>> to compare the start/end timestamps.
>>>
>>> <alarm/1/state/124> prov:wasRevisionOf <alarm/1/state/123> .
>>>
>>> So what if we want to describe who disabled it? A simple solution is
>>> to now just provide prov:wasAttributedTo at each state:
>>>
>>> <alarm/1/state/124> prov:wasAttributedTo <customer/5> .
>>>
>>> So now we know that customer/5 caused the alarm to be disabled somehow
>>> (it was not a supervisor at the security company).
>>>
>>> If you want to detail this more, say to record how the customer did
>>> this (e.g. clicking the alarm panel) - then you can introduce an
>>> activity to describe the transition:
>>>
>>> <alarm/1/state/123> prov:wasInvalidatedBy <activities/987> .
>>> <alarm/1/state/124> prov:wasGeneratedBy <activities/987> .
>>>
>>> <activities/987> a prov:Activity, ex:AlarmPanelAction ;
>>>     prov:wasAssociatedWith <customer/5> .
>>>
>>> Now, to get back to the bundles - if you have separate bundles of
>>> alarm activities, for instance one per week, then you could refer back
>>> to the original bundle in your specialization as we say in
>>> http://www.w3.org/TR/prov-links/ :
>>>
>>> <alarm/1/state/124> prov:mentionOf <alarm/1> ;
>>>     prov:asInBundle <http://example.com/alarms> .
>>>
>>> In a way this is just a more formal way of saying:
>>>
>>> <alarm/1/state/124> prov:specializationOf <alarm/1> .
>>> <alarm/1> prov:has_provenance <http://example.com/alarms> .
>>>
>>> (using has_provenance from http://www.w3.org/TR/prov-aq/ ),
>>> as the mentionOf/asInBundle adds the additional promise that you will
>>> find <alarm/1> described as an entity inside that bundle.
>>>
>>> On 29 October 2013 11:26, Mike Loll <mike.loll@gmail.com> wrote:
>>> > Thanks, Stian.
>>> >
>>> > My understanding is that an entity referenced in a bundle (e.g. via
>>> > wasGeneratedBy) must be in the bundle... but I do not wish to duplicate
>>> > entity definitions throughout my bundles. My entities are long-lived
>>> > and will exist in multiple bundles.
>>> >
>>> > So let's say I have a resource for alarms which contains a list of all
>>> > alarms my company monitors. If I turn off the alarm at /alarm/1, my
>>> > understanding is that in prov a new entity is created for the new
>>> > state of /alarm/1.
>>> > But in my actual data store, I don't create a new record, I just
>>> > toggle a flag.
>>> >
>>> > So there is a disconnect between how my prov looks and how my data
>>> > looks. This is by design, as I understand it. So I would have a new
>>> > entity in my prov for the /alarm/1 in the new state which is a
>>> > specialization of /alarm/1, yes?
>>> >
>>> > Ultimately, I want to display all of the provenance for /alarm/1 so I
>>> > can see its history from creation to invalidation. Am I going about
>>> > this the wrong way?
>>> >
>>> > --
>>> > Mike Loll
>>> >
>>> > On Mon, Oct 28, 2013 at 9:54 AM, Stian Soiland-Reyes
>>> > <soiland-reyes@cs.manchester.ac.uk> wrote:
>>> >>
>>> >> Hi!
>>> >>
>>> >> I would say that any resource that contains provenance statements (in
>>> >> particular PROV statements) is a prov:Bundle. However that fact might
>>> >> not be recorded anywhere, and it would generally only be used as a
>>> >> term when you want to describe provenance of provenance records, or if
>>> >> you are cataloguing provenance traces.
>>> >>
>>> >> In my application I report the provenance of a scientific workflow run.
>>> >>
>>> >> When I save this provenance to a file, it includes its own
>>> >> meta-provenance so you can tell how and when this file was recorded,
>>> >> as it could have been saved from the internal database at an arbitrary
>>> >> time after the run.
>>> >>
>>> >> In RDF this is normally quite easy by simply describing the relative
>>> >> URI <> which would mean "this document" - wherever it ends up being
>>> >> located:
>>> >>
>>> >> <> a prov:Bundle ;
>>> >>     foaf:primaryTopic <http://ns.taverna.org.uk/2011/run/5e93cdba-27ec-4757-addf-fc91be12c7a4/> ;
>>> >>     prov:wasGeneratedBy <#taverna-prov-export> .
>>> >>
>>> >> <#taverna-prov-export> a prov:Activity ;
>>> >>     rdfs:label "taverna-prov export of workflow run provenance"@en ;
>>> >>     prov:startedAtTime "2013-09-02T15:22:25.961Z"^^xsd:dateTime ;
>>> >>     prov:endedAtTime "2013-09-02T15:22:30.89Z"^^xsd:dateTime ;
>>> >>     prov:wasInformedBy <http://ns.taverna.org.uk/2011/run/5e93cdba-27ec-4757-addf-fc91be12c7a4/> ;
>>> >>     prov:wasAssociatedWith <#taverna-engine> ;
>>> >>     prov:qualifiedAssociation [ a prov:Association ;
>>> >>         prov:hadPlan <http://ns.taverna.org.uk/2011/software/taverna-2.4.0> ;
>>> >>         prov:agent <#taverna-engine>
>>> >>     ] .
>>> >>
>>> >> <http://ns.taverna.org.uk/2011/run/5e93cdba-27ec-4757-addf-fc91be12c7a4/>
>>> >>     a prov:Activity, wfprov:WorkflowRun ;
>>> >>     rdfs:label "Workflow run of GWAS_to_biomedical_c"@en ;
>>> >>     prov:startedAtTime "2013-09-02T17:19:31.676+02:00"^^xsd:dateTime ;
>>> >>     prov:endedAtTime "2013-09-02T17:20:00.662+02:00"^^xsd:dateTime .
>>> >>
>>> >> # .. followed by the actual workflow run provenance with many more
>>> >> # activities and nested wfprov:WorkflowRuns
>>> >>
>>> >> I am not saying that everyone should include this meta-provenance as
>>> >> in many cases it would be self-evident or not relevant - but in my
>>> >> case it is important for three reasons.
>>> >>
>>> >> 1 - I can see the version of the software used to generate the
>>> >> provenance (as I am still developing that)
>>> >> 2 - I can see when provenance was exported compared to when it was
>>> >> run. In this case just 2 minutes after - and hence I can be fairly
>>> >> certain about statements the provenance trace makes about generated
>>> >> files etc.
>>> >> 3 - I use foaf:primaryTopic (my own convention - which makes <> also
>>> >> be a foaf:Document) to find the top-level "starting point" of the
>>> >> provenance.
>>> >> (this is also indicated with the slightly weaker relation
>>> >> prov:wasInformedBy)
>>> >>
>>> >> On 28 October 2013 11:26, Mike Loll <mike.loll@gmail.com> wrote:
>>> >> > I'm having some difficulty wrapping my head around when bundles would
>>> >> > be used. Is it so we can describe how a set of provenance records came
>>> >> > to be (the provenance of the provenance)?
>>> >> >
>>> >> > I'm having a little difficulty wrapping my head around the use cases.
>>> >> >
>>> >> > Example 40 from
>>> >> > http://www.w3.org/TR/2013/REC-prov-dm-20130430/#component4
>>> >> > shows two reports (r1, r2) being generated with r2 derived from r1.
>>> >> > It then describes a bundle describing that "Bob" witnessed r1 being
>>> >> > generated. The example goes on to show a bundle for "Alice" observing
>>> >> > the generation of r2.
>>> >> >
>>> >> > How is this useful? I think my real question is: shouldn't all
>>> >> > provenance events be contained in a bundle?
>>> >> >
>>> >> > Any insight is appreciated.
>>> >> >
>>> >> > I'm working on a Clojure implementation of the provenance model as an
>>> >> > exercise and I want to be sure I have my understanding set.
>>> >> >
>>> >> > Thanks.
>>> >> >
>>> >> > --
>>> >> > Mike Loll

--
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
http://soiland-reyes.com/stian/work/
http://orcid.org/0000-0001-9842-9718
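
The thread describes a data store that only toggles a flag, while the provenance records one specialized entity per alarm state. A minimal sketch of how that gap might be bridged, assuming a time-ordered log of (timestamp, status) changes is available: each log entry becomes an ex:AlarmState whose invalidation time is the next entry's timestamp, chained with prov:wasRevisionOf. The names AlarmState, states_from_log and to_turtle are hypothetical helpers invented for this illustration, not part of any PROV library, and the prov:, ex: and xsd: prefixes are assumed to be declared elsewhere in the output document. Python is used only for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class AlarmState:
    """One 'locked down' state of an alarm (hypothetical helper type)."""
    uri: str
    alarm_uri: str
    status: str
    generated: str
    invalidated: Optional[str] = None
    revision_of: Optional[str] = None


def states_from_log(alarm_uri: str,
                    log: List[Tuple[str, str]],
                    first_id: int = 1) -> List[AlarmState]:
    """Turn a time-ordered list of (timestamp, status) changes into states.

    Assumes every log entry is a real status change and that the log is
    already sorted by timestamp.
    """
    states: List[AlarmState] = []
    for offset, (timestamp, status) in enumerate(log):
        state = AlarmState(
            uri=f"{alarm_uri}/state/{first_id + offset}",
            alarm_uri=alarm_uri,
            status=status,
            generated=timestamp,
        )
        if states:
            previous = states[-1]
            # The new state starts exactly when the previous one ends.
            previous.invalidated = timestamp
            state.revision_of = previous.uri
        states.append(state)
    return states


def to_turtle(state: AlarmState) -> str:
    """Serialize one state as a Turtle statement (prefixes assumed declared)."""
    lines = [
        f"<{state.uri}> a prov:Entity, ex:AlarmState ;",
        f"    prov:specializationOf <{state.alarm_uri}> ;",
        f'    ex:currentStatus "{state.status}" ;',
        f'    prov:generatedAtTime "{state.generated}"^^xsd:dateTime',
    ]
    if state.invalidated is not None:
        lines[-1] += " ;"
        lines.append(
            f'    prov:invalidatedAtTime "{state.invalidated}"^^xsd:dateTime')
    if state.revision_of is not None:
        lines[-1] += " ;"
        lines.append(f"    prov:wasRevisionOf <{state.revision_of}>")
    lines[-1] += " ."
    return "\n".join(lines)


# Usage: the two states from the thread, numbered from 123.
log = [("2013-10-28T18:00:00Z", "active"),
       ("2013-10-28T23:50:00Z", "disabled")]
for alarm_state in states_from_log("alarm/1", log, first_id=123):
    print(to_turtle(alarm_state))
    print()
```

Displaying the full history of /alarm/1 then amounts to listing every entity that is a prov:specializationOf it and following the prov:wasRevisionOf chain, without comparing timestamps.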
Received on Tuesday, 29 October 2013 15:23:09 UTC