Re: Bundles explained from Mike Loll on 2013-10-29 (public-prov-comments@w3.org from October 2013)

From: Mike Loll <mike.loll@gmail.com>
Date: Tue, 29 Oct 2013 08:12:29 -0400
To: Daniel Garijo <dgarijo@delicias.dia.fi.upm.es>
Cc: "public-prov-comments@w3.org" <public-prov-comments@w3.org>
Message-ID: <CAHHe3w3b=2mjor5xhkpfrZHcNKVQcP7ThGsvs6p8R69KJ9S22A@mail.gmail.com>
Thanks for the great example, Daniel.  This is what I want to do.  I must
admit prov-links is one of the specs I have not read too much into yet.
I'll be taking a look today.

So if I switched the alarm on->off->on, would that result in 3 entities?  (I
think so.)

:alarm/1/on1
:alarm/1/off1
:alarm/1/on2

I would use some naming convention to standardize the names so I can easily
find them (e.g. on1, on2, etc.).
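
A rough sketch of the naming idea, in Python purely for illustration: each
state gets its own counter, so a repeated state yields on1, on2, and so on.
The function names are my own invention, not part of any PROV library, and
the printed lines are informal triples rather than valid Turtle.

```python
from collections import Counter


def state_entity_names(base, states):
    """Return one entity name per state change, e.g. base + "/on1".

    A separate counter is kept per state, so a state that occurs twice
    ("on" ... "on") produces two distinct names ("on1", "on2").
    """
    seen = Counter()
    names = []
    for state in states:
        seen[state] += 1
        names.append(f"{base}/{state}{seen[state]}")
    return names


def derivation_pairs(names):
    """Pair each state entity with its predecessor (later, earlier)."""
    return list(zip(names[1:], names[:-1]))


# Hypothetical history for the alarm discussed in this thread.
names = state_entity_names(":alarm/1", ["on", "off", "on"])

# Every state entity is a specialization of the general entity.
for name in names:
    print(name, "prov:specializationOf", ":alarm/1")

# Every state after the first is derived from the state before it.
for later, earlier in derivation_pairs(names):
    print(later, "prov:wasDerivedFrom", earlier)
```

Because the names are predictable, finding every state of :alarm/1 is a
prefix match on ":alarm/1/", and the order can be recovered from the
wasDerivedFrom chain.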

I'm trying to make sure I am doing provenance "the correct way" but I think
I need to keep in mind, as you wrote, that I can go as complex or simple as
I like.


--
Mike Loll


On Tue, Oct 29, 2013 at 7:44 AM, Daniel Garijo <
dgarijo@delicias.dia.fi.upm.es> wrote:

> Hi Mike,
> what you describe sounds like a case for specialization (see
> http://www.w3.org/TR/prov-o/#specializationOf).
> You have a general entity (/alarm/1) which you want to contextualize
> during different periods of time
> (/alarm/1/when_I_switched_off, /alarm/1/when_something_happened, etc.).
> When the state of the alarm changes, a new entity is produced (you were
> right there). But that entity is a specialization of the
> general entity (/alarm/1), which is derived from a previous state of the
> entity. Therefore your data is not disconnected from the provenance.
>
> Your example (encoded in rdf):
> :alarm/1 a :alarm . # general entity
>
> :alarm/1/on a :alarm;
>         prov:specializationOf :alarm/1.
>
> :switch_Off a prov:Activity;
>                  prov:used :alarm/1/on;
>                  prov:generated :alarm/1/off;
>                  prov:wasAssociatedWith :Mike.
>
> :alarm/1/off a :alarm;
>                  prov:wasGeneratedBy :switch_Off;
>                  prov:specializationOf :alarm/1;
>                  prov:wasDerivedFrom :alarm/1/on.
>
> Now you can ask for the provenance of the alarm through its
> specializations, and you can ask for each of the states of the alarm and
> how it was produced. Of course, all this depends on the granularity
> you want to record in your system. It can be simplified or made more
> complex; that is up to you. If you want to know more about contextualized
> entities in bundles (which I think is not your case), then I recommend
> you have a look at this document:
> http://www.w3.org/TR/2013/NOTE-prov-links-20130430/
>
> Best,
> Daniel
>
>
> 2013/10/29 Mike Loll <mike.loll@gmail.com>
>
>> Thanks, Stian.
>>
>> My understanding is that an entity referenced in a bundle (e.g. via
>> wasGeneratedBy) must be in the bundle...but I do not wish to duplicate
>> entity definitions throughout my bundles.  My entities are long-lived and
>> will exist in multiple bundles.
>>
>> So let's say I have a resource for alarms which contains a list of all
>> alarms my company monitors.  If I turn off the alarm at /alarm/1, my
>> understanding is that in prov a new entity is created for the new state of
>> /alarm/1.  But in my actual data store, I don't create a new record, I just
>> toggle a flag.
>>
>> So there is a disconnect between how my prov looks and how my data
>> looks.  My understanding is that this is by design.  So I would have a new
>> entity in my prov for /alarm/1 in the new state, which is a
>> specialization of /alarm/1, yes?
>>
>> Ultimately, I want to display all of the provenance for /alarm/1 so I can
>> see its history from creation to invalidation.  Am I going about this the
>> wrong way?
>>
>>
>> --
>> Mike Loll
>>
>>
>> On Mon, Oct 28, 2013 at 9:54 AM, Stian Soiland-Reyes <
>> soiland-reyes@cs.manchester.ac.uk> wrote:
>>
>>> Hi!
>>>
>>> I would say that any resource that contains provenance statements (in
>>> particular PROV statements) is a prov:Bundle. However that fact might
>>> not be recorded anywhere, and it would generally only be used as a
>>> term when you want to describe provenance of provenance records, or if
>>> you are cataloguing provenance traces.
>>>
>>>
>>> In my application I report the provenance of a scientific workflow run.
>>>
>>> When I save this provenance to a file, it includes its own
>>> meta-provenance so you can tell how and when this file was recorded,
>>> as it could have been saved from the internal database at an arbitrary
>>> time after the run.
>>>
>>> In RDF this is normally quite easy by simply describing the relative
>>> URI <> which would mean "this document" - wherever it ends up being
>>> located:
>>>
>>>
>>> <> a prov:Bundle ;
>>>     foaf:primaryTopic <http://ns.taverna.org.uk/2011/run/5e93cdba-27ec-4757-addf-fc91be12c7a4/> ;
>>>     prov:wasGeneratedBy <#taverna-prov-export> .
>>>
>>> <#taverna-prov-export> a prov:Activity ;
>>>     rdfs:label "taverna-prov export of workflow run provenance"@en ;
>>>     prov:startedAtTime "2013-09-02T15:22:25.961Z"^^xsd:dateTime ;
>>>     prov:endedAtTime "2013-09-02T15:22:30.89Z"^^xsd:dateTime ;
>>>     prov:wasInformedBy <http://ns.taverna.org.uk/2011/run/5e93cdba-27ec-4757-addf-fc91be12c7a4/> ;
>>>     prov:wasAssociatedWith <#taverna-engine> ;
>>>     prov:qualifiedAssociation [ a prov:Association ;
>>>         prov:hadPlan <http://ns.taverna.org.uk/2011/software/taverna-2.4.0> ;
>>>         prov:agent <#taverna-engine>
>>>     ] .
>>>
>>> <http://ns.taverna.org.uk/2011/run/5e93cdba-27ec-4757-addf-fc91be12c7a4/> a prov:Activity, wfprov:WorkflowRun ;
>>>     rdfs:label "Workflow run of GWAS_to_biomedical_c"@en ;
>>>     prov:startedAtTime "2013-09-02T17:19:31.676+02:00"^^xsd:dateTime ;
>>>     prov:endedAtTime "2013-09-02T17:20:00.662+02:00"^^xsd:dateTime .
>>>
>>> # .. followed by the actual workflow run provenance with many more
>>> # activities and nested wfprov:WorkflowRuns
>>>
>>>
>>> I am not saying that everyone should include this meta-provenance as
>>> in many cases it would be self-evident or not relevant - but in my
>>> case it is important for three reasons.
>>>
>>> 1 - I can see the version of the software used to generate the
>>> provenance (as I am still developing that)
>>> 2 - I can see when provenance was exported compared to when it was
>>> run. In this case just 2 minutes after - and hence I can be fairly
>>> certain about statements the provenance trace makes about generated
>>> files etc.
>>> 3 - I use foaf:primaryTopic (my own convention - which makes <> also
>>> be a foaf:Document) to find the top-level "starting point" of the
>>> provenance. (this is also indicated with the slightly weaker relation
>>> prov:wasInformedBy)
>>>
>>>
>>>
>>> On 28 October 2013 11:26, Mike Loll <mike.loll@gmail.com> wrote:
>>> > I'm having some difficulty wrapping my head around when bundles would be
>>> > used.  Is it so we can describe how a set of provenance records came to be
>>> > (the provenance of the provenance)?
>>> >
>>> > I'm having a little difficulty wrapping my head around the use cases.
>>> >
>>> > Example 40 from http://www.w3.org/TR/2013/REC-prov-dm-20130430/#component4
>>> > shows two reports (r1, r2) being generated with r2 derived from r1.  It then
>>> > describes a bundle describing that "Bob" witnessed r1 being generated.  The
>>> > example goes on to show a bundle for "Alice" observing the generation of r2.
>>> >
>>> > How is this useful?  I think my real question is shouldn't all provenance
>>> > events be contained in a bundle?
>>> >
>>> > Any insight is appreciated.
>>> >
>>> > I'm working on a clojure implementation of the provenance model as an
>>> > exercise and I want to be sure I have my understanding set.
>>> >
>>> > Thanks.
>>> >
>>> >
>>> > --
>>> > Mike Loll
>>>
>>>
>>>
>>> --
>>> Stian Soiland-Reyes, myGrid team
>>> School of Computer Science
>>> The University of Manchester
>>> http://soiland-reyes.com/stian/work/
>>> http://orcid.org/0000-0001-9842-9718
>>>
>>
>>
>
Received on Tuesday, 29 October 2013 12:13:01 UTC