- From: Jim McCusker <mccusj@rpi.edu>
- Date: Wed, 9 May 2012 19:35:13 -0400
- To: Daniel Garijo <dgarijo@delicias.dia.fi.upm.es>
- Cc: Paolo Missier <Paolo.Missier@ncl.ac.uk>, Paolo Missier <paolo.missier@newcastle.ac.uk>, Stephan Zednik <zednis@rpi.edu>, Davide Ceolin <davide.ceolin@gmail.com>, "public-prov-comments@w3.org" <public-prov-comments@w3.org>
- Message-ID: <CAAtgn=RpAfWykRHczuxZZgM-1gOLjU=RgW+vpp+4n9QkpB++cQ@mail.gmail.com>
Yes, granularity is the issue I'm referring to.

Jim

On Wed, May 9, 2012 at 7:22 PM, Daniel Garijo <dgarijo@delicias.dia.fi.upm.es> wrote:

> Hi Paolo,
> I think it has to do more with granularity than with process description:
> A user A may see the experiment (ex1) as an activity which uses dataset d1
> and produces result r1.
>
> Another user may want a lower level of granularity, and for him the
> experiment ex1 had 2 intermediate steps:
> task123 and task124: task123 used d1 and produced r1', while task124 uses
> r1' to produce r1.
>
> So, besides the fact that task123 and task124 can be considered part of
> ex1, we have 2 provenance traces
> that correspond to 2 different accounts where r1 is produced by 2
> different activities. And that is not currently
> supported in DM, because it's functional. Am I wrong?
>
> Best,
> Daniel
>
> 2012/5/10 Paolo Missier <Paolo.Missier@ncl.ac.uk>
>
>> absolutely, but what you are referring to with "steps within an
>> experiment" seems to indicate that there is a process description which
>> includes structural containment, and my understanding is that by design
>> prov does not include process description at all. What I believe you can
>> say is that you observed one activity (the "experiment") start another
>> ("task123"). Then, you can say that task123 generated entity e1, but no
>> relationship between the experiment and e1 would follow.
>> So do we need to extend the model to capture process description?
>>
>> -Paolo
>>
>> On 5/9/12 11:50 PM, Jim McCusker wrote:
>>
>> If I have an experiment, and that experiment generates a data file, but
>> there were steps within that experiment that actually did the work, I would
>> think we should be able to talk about that within an account.
>>
>> Jim
>>
>> On Wed, May 9, 2012 at 6:43 PM, Paolo Missier <Paolo.Missier@ncl.ac.uk> wrote:
>>
>>> May I ask what /is/ activity composition? i.e.
>>> what is the semantics of
>>>
>>> :a2 a prov:Activity; dc:partOf :a1
>>>
>>> (the use of dc:partOf seems to confirm that prov does not include such
>>> concept).
>>>
>>> Also, I think what Davide has in mind with
>>>
>>> " two separate graphs stating that each of the two activities generated
>>> the entity"
>>> is a form of "bundling", or separate accounts, so the statement
>>>
>>> :e1 a prov:Entity; prov:wasGeneratedBy :a1, :a2.
>>>
>>> would not hold within a single account, and thus the
>>> generation-uniqueness rule does not apply?
>>>
>>> -Paolo
>>>
>>> On 5/9/12 11:06 PM, Stephan Zednik wrote:
>>>
>>> Perhaps wasGeneratedBy should not be functional?
>>>
>>> I think supporting activity composition will be heavily requested by
>>> the provenance community. I know projects at RPI/HAO that I am a part of
>>> and provenance projects at CSIRO have recognized it as an important
>>> (potentially critical) aspect in generating provenance
>>> presentations/visualizations for end users.
>>>
>>> Perhaps if a :a2 generated an entity :e2 that was a specialization of
>>> :e1?
>>>
>>> We ~should~ be able to record provenance at different, and logically
>>> connected, levels of abstraction, and activity composition seems a natural
>>> component for doing so.
>>>
>>> --Stephan
>>>
>>> On May 9, 2012, at 3:56 PM, Jim McCusker wrote:
>>>
>>> There are some problems here with composition though, specifically when
>>> you try to say something like this:
>>>
>>> :a1 a prov:Activity.
>>> :a2 a prov:Activity; dc:partOf :a1.
>>>
>>> :e1 a prov:Entity; prov:wasGeneratedBy :a1, :a2.
>>>
>>> Basically, since :a2 is part of :a1, and :a2 served as a "final
>>> activity" (there aren't any further activities that used :e1), :e1, by
>>> virtue of being generated by :a2 was also generated by :a1. But since
>>> wasGeneratedBy is functional, we cannot assert that without :a1 and :a2
>>> becoming identical (sameAs).
>>>
>>> Jim
>>>
>>> On Wed, May 9, 2012 at 5:47 PM, Paolo Ncl <Paolo.Missier@ncl.ac.uk> wrote:
>>>
>>>> Davide
>>>>
>>>> I guess it depends on how you define "part of" in this setting. You can
>>>> specify that an activity has started another, which makes, informally, the
>>>> former a "parent" of the latter. You can use this to model forking, for
>>>> example. This is about the observed behavior of a process and is within
>>>> scope. But there is no way to express structural containment, or
>>>> composition, because describing process models and structure (for instance,
>>>> the structure of a program, a workflow, a script etc.) is not within the
>>>> PROV scope.
>>>> I hope others in the group concur with this interpretation
>>>>
>>>> Regards,
>>>>
>>>> P.Missier - paolo.missier@ncl.ac.uk
>>>>
>>>> On 7 May 2012, at 21:44, Davide Ceolin <davide.ceolin@gmail.com> wrote:
>>>>
>>>> > Hello,
>>>> >
>>>> > I am a PhD student of the VU University Amsterdam, and I would have a
>>>> question about the composition of activities in PROV. I noticed that it is
>>>> not possible to explicitly state that an activity is actually part of
>>>> another one.
>>>> >
>>>> > Suppose that a given entity is the result of an activity and, in
>>>> turn, this activity is part of a larger one.
>>>> >
>>>> > I can represent this scenario with two separate graphs stating that
>>>> each of the two activities generated the entity, and from them (and their
>>>> execution times, etc.) I may infer that one is part of the other one, but I
>>>> can't explicitly state that.
>>>> >
>>>> > Is there a specific reason for such a limitation?
>>>> >
>>>> > Thanks,
>>>> >
>>>> > Davide
>>>> >
>>>> > Davide Ceolin MSc.
>>>> > PhD student
>>>> > The Network Institute
>>>> > VU University Amsterdam
>>>> > d.ceolin@vu.nl
>>>> > http://www.few.vu.nl/~dceolin/
>>>>
>>>
>>> --
>>> Jim McCusker
>>> Programmer Analyst
>>> Krauthammer Lab, Pathology Informatics
>>> Yale School of Medicine
>>> james.mccusker@yale.edu | (203) 785-6330
>>> http://krauthammerlab.med.yale.edu
>>>
>>> PhD Student
>>> Tetherless World Constellation
>>> Rensselaer Polytechnic Institute
>>> mccusj@cs.rpi.edu
>>> http://tw.rpi.edu
>>>
>>> --
>>> ----------- ~oo~ --------------
>>> Paolo Missier - Paolo.Missier@newcastle.ac.uk, pmissier@acm.org
>>> School of Computing Science, Newcastle University, UK
>>> http://www.cs.ncl.ac.uk/people/Paolo.Missier

--
Jim McCusker
Programmer Analyst
Krauthammer Lab, Pathology Informatics
Yale School of Medicine
james.mccusker@yale.edu | (203) 785-6330
http://krauthammerlab.med.yale.edu

PhD Student
Tetherless World Constellation
Rensselaer Polytechnic Institute
mccusj@cs.rpi.edu
http://tw.rpi.edu
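
The functional-property problem raised in this thread can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of PROV or of any RDF library: the tuple-based triple representation, the helper `functional_clashes`, and the account names `single_account`, `coarse`, and `fine` are all hypothetical. It checks which entities have more than one distinct generating activity, which is the situation where a functional wasGeneratedBy would force the activities to be identical (sameAs).

```python
from collections import defaultdict

# Hypothetical representation: an account is a set of
# (subject, predicate, object) triples. Identifiers mirror the thread.
WAS_GENERATED_BY = "prov:wasGeneratedBy"


def functional_clashes(triples, predicate=WAS_GENERATED_BY):
    """Return {subject: objects} for every subject that has more than one
    distinct object under `predicate`. If the predicate is functional,
    each such set of objects would have to denote the same individual."""
    objects = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            objects[s].add(o)
    return {s: objs for s, objs in objects.items() if len(objs) > 1}


# Jim's example: both activities generate :e1 inside one account.
single_account = {
    (":a2", "dc:partOf", ":a1"),
    (":e1", WAS_GENERATED_BY, ":a1"),
    (":e1", WAS_GENERATED_BY, ":a2"),
}

# Daniel's example: the same experiment described at two granularities,
# kept as two separate accounts.
coarse = {(":r1", WAS_GENERATED_BY, ":ex1")}
fine = {
    (":r1prime", WAS_GENERATED_BY, ":task123"),
    (":r1", WAS_GENERATED_BY, ":task124"),
}

if __name__ == "__main__":
    print("single account:", functional_clashes(single_account))
    print("coarse account:", functional_clashes(coarse))
    print("fine account:", functional_clashes(fine))
    print("merged accounts:", functional_clashes(coarse | fine))
```

Under these assumptions, each account on its own reports no clash, while Jim's single-account graph and the merged accounts do. That matches the suggestion in the thread that the generation-uniqueness rule would apply within one account rather than across accounts.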
Received on Wednesday, 9 May 2012 23:36:05 UTC