- From: Daniel Garijo <dgarijo@delicias.dia.fi.upm.es>
- Date: Thu, 25 Aug 2011 17:48:27 +0200
- To: Graham Klyne <GK@ninebynine.org>
- Cc: "Myers, Jim" <MYERSJ4@rpi.edu>, Satya Sahoo <satya.sahoo@case.edu>, "Deus, Helena" <helena.deus@deri.org>, Khalid Belhajjame <Khalid.Belhajjame@cs.man.ac.uk>, "public-prov-wg@w3.org" <public-prov-wg@w3.org>
- Message-ID: <CAExK0DfL-fM30VYnuKM0CjQPhmUc0Q_yiTrtPwKSQZEepyVx9A@mail.gmail.com>
Graham,

You are right, "force" might not have been the best word to express what I meant. What I was trying to say is that some publishers might not even be aware of these extra levels of provenance. By defining them in the formal model, publishers may realize that they could easily add them to what they are publishing. But it is always an optional thing to do, of course.

Best,
Daniel

2011/8/25 Graham Klyne <GK@ninebynine.org>

> On 21/08/2011 04:42, Daniel Garijo wrote:
>
>> Yes, it makes a lot of sense. Thanks for the diagram, it really helps a lot. I see that the notion of container can be implicit in the relationship "provenanceOf". In case I wanted to distinguish between different types of containers (e.g., the level of metadata provenance they are recording) I could always extend the relationship with subrelationships depending on my domain, so I like your solution.
>
> Daniel,
>
> It turns out that in replying to Jim just now, I realize there *MAY* be a need for some notion of containment when dealing with accounts. But I'd still like to be able to deal with bare provenance without having to tangle with arcana like accounts and their accoutrements. (Not saying that accounts don't have uses, just that as a developer I don't want to be forced into considering them or their structures when recording and exchanging "raw" provenance information.)
>
>> However, I understood the container also as a way to "force" provenance publishers to group their provenance descriptions in order to have them more structured. If we decide not to add this class, at least we should add a note in the recommendations or guidelines about provenance publication. If we avoid mentioning this issue, I think that most provenance publishers will not group their descriptions, just because they won't realize that they should do it.
>
> I don't think it's appropriate for the spec to "force" publishers to do anything that isn't strictly needed for some useful level of interoperability. The general thrust of many of my comments is to try and identify the minimum that needs to be asserted to get provenance off the ground as a form of exchangeable information.
>
>> I haven't seen any mention of a provenanceOf relationship in the model. Should we raise an issue? Or do you want to leave it out of scope too? (I would not agree on this last point)
>
> I think something like that will turn out to be necessary when we consider its representation in RDF; whether it's needed in the abstract model is less clear to me.
>
> #g
> --
>
>> 2011/8/19 Graham Klyne <GK@ninebynine.org>
>>
>>> Daniel,
>>>
>>> Thanks for your constructive response: it's very helpful here to have a more concrete example to talk around. I've delayed replying until I have time to properly understand, think through and respond to your scenario.
>>>
>>> Daniel Garijo wrote:
>>>
>>>> Going back to the provenance container, I'll put an example which I think shows its usefulness. If you can address it without the provenance containers, then I'll be happy to know how :)
>>>> The example: A project P wants to publish provenance information and descriptions about the resources available in various museums M1...Mn. Now imagine that for some resource R many museums have provenance records that may be conflicting. The descriptive statements of the resource are separated in one named graph and the provenance statements in another (provenance container). This way I can query the provenance containers from all the museums about R without getting the descriptions, and filter them by creator, in case I'm only interested in the records of the entities I trust more.
>>>
>>> I assume here that when you say "I can query the provenance containers from all the museums about R without getting the descriptions", you are asking about who has created the provenance information, rather than the provenance information itself?
>>>
>>> If so, the answer is easy: the provenance resources themselves may have provenance information, which are separate resources.
>>>
>>> E.g.
>>>
>>> (I've attached a quick diagram sketch to this email as a PNG file. It can also be viewed at <http://dvcs.w3.org/hg/prov/raw-file/09ac58672f56/diagrams/Provenance-of-provenance.png>. Omnigraffle source is also in the WG repository. These may be re-used or adapted by any or all as they see fit.)
>>>
>>> In this diagram, the provenance information at P1P would contain a provenance statement to the effect that P1 is created by M1. Similarly for P2P. The provenance resources P1P and P2P could be located given P1 and P2 using the same PAQ mechanisms that might be used to locate P1 and P2 given R.
>>>
>>> None of this precludes using named graphs or other provenance containers for your implementation. By simply treating provenance information as another resource, we don't need to make containers a separate concept. Thus, I submit that while the existence of containment is not prohibited, it is outside the scope of this group to specify as a requirement or mechanism in a provenance standard. Another way of looking at this might be to note that any resource might be a container for other information, and no more needs to be said about containment.
>>>
>>> To address my other point about mixing provenance and non-provenance: a use-case I came across just yesterday was a desire to combine provenance information from a workflow execution together with information about resource usage during said execution, to be used in assessing if a workflow might be moved to a different execution environment. If a provenance container is somehow restricted to containing just provenance information, this would not be so easy.
>>>
>>> But I do acknowledge that (using RDF) it *is* necessary to separate the provenance-of-provenance from the provenance (and other description) about a resource.
>>>
>>> Is this making any sense to you?
>>>
>>> #g
>>> --
>>>
>>> Daniel Garijo wrote:
>>>
>>>> Hi Graham,
>>>>
>>>> I still disagree. Could you please explain to me what the benefits are of mixing provenance statements with non-provenance statements in the resources? Isn't that what we already have in most applications that record at least a bit of provenance?
>>>>
>>>> From my point of view, if I ask for the provenance of a resource I want to obtain the provenance of the resource, not non-provenance statements about the resource. But I may be getting the idea wrong (I haven't had time to go through the PAQ document yet).
>>>>
>>>> Going back to the provenance container, I'll put an example which I think shows its usefulness. If you can address it without the provenance containers, then I'll be happy to know how :)
>>>> The example: A project P wants to publish provenance information and descriptions about the resources available in various museums M1...Mn. Now imagine that for some resource R many museums have provenance records that may be conflicting. The descriptive statements of the resource are separated in one named graph and the provenance statements in another (provenance container).
>>>> This way I can query the provenance containers from all the museums about R without getting the descriptions, and filter them by creator, in case I'm only interested in the records of the entities I trust more.
>>>>
>>>> I think this separation is better than just having everything mixed up in a named graph with all the statements about the resource, but I'm open to discussing alternatives.
>>>>
>>>> Best,
>>>> Daniel
>>>>
>>>> 2011/8/17 Graham Klyne <GK@ninebynine.org>
>>>>
>>>> Daniel,
>>>>
>>>> I'm sorry, I don't see the need here. If provenance information is presented as web resources (which I think it should be), then assertions can be made about those resources which in turn can be the basis for the kinds of filtering you describe.
>>>>
>>>> Whether those provenance resources are presented as separate RDF documents or named graphs within a single RDF document is, to my mind, an implementation choice, nothing more.
>>>>
>>>> I can't see why any more (i.e. notion of container) is needed here in the model.
>>>>
>>>> #g
>>>> --
>>>>
>>>> BTW, as an implementation choice, I fully expect provenance resources may also contain non-provenance information, so the issue of separating provenance from non-provenance can't be entirely addressed at the level of "containers" anyway.
>>>>
>>>> If the provenance model *prevents* me from mixing provenance RDF with other RDF, then I think it fails in a very fundamental way. Thus any notion of container must also work for non-provenance RDF, so what would be needed is not a provenance-specific notion, but one that works for all of RDF. And that, I submit, is out of scope for this group.
>>>>
>>>> Daniel Garijo wrote:
>>>>
>>>> Hi Graham,
>>>>
>>>> It is true, they can be treated as independent resources if they are named graphs. However, there can be other named graphs in a triple store along with the provenance containers. If we retrieve all the graphs that have triples talking about the resource we might obtain named graphs which won't contain provenance about the resource, but if we have asserted which ones are provenance containers, we can filter them easily.
>>>>
>>>> So, in summary, I think they are important to distinguish the normal descriptions from the metadata provenance in the triple store.
>>>>
>>>> Best,
>>>> Daniel
>>>>
>>>> 2011/8/16 Graham Klyne <GK@ninebynine.org>
>>>>
>>>> Daniel,
>>>>
>>>> Why the need? If named graphs are used, they have URIs and can be treated as independent resources. Whether these are actually stored as separate RDF resources, or as named graphs within an RDF document is an implementation choice that should not be exposed through the abstract provenance model, IMO.
>>>>
>>>> Not formally recognizing containers (or whatever) in the model does not prevent you from creating an application that does what you describe.
>>>>
>>>> I think the crux of our debate is this: is there an interoperability requirement that cannot be satisfied without explicitly recognizing containers in the model? If there is a compelling such requirement, I'll withdraw my objection.
>>>>
>>>> #g
>>>> --
>>>>
>>>> Daniel Garijo wrote:
>>>>
>>>> Yes, I was thinking about named graphs for grouping the provenance descriptions.
>>>> However, I do think that the model should explicitly recognize the "provenance container" (or whatever we decide to name it in the end), so I could select the provenance containers having statements referring to a resource and filter them depending on a certain constraint (like author or date of creation).
>>>>
>>>> Best,
>>>> Daniel
>>>>
>>>> 2011/8/15 Graham Klyne <GK@ninebynine.org>
>>>>
>>>> Daniel,
>>>>
>>>> It sounds to me as if you're trying to subdivide web resources, and that seems to me like a potential lot of complexity for questionable gain.
>>>>
>>>> (If you're thinking of something like named graphs in an RDF document, then fine: here each of the graphs has its own URI, so for descriptive purposes can be treated as a separate web resource. I don't think this is something the model needs to explicitly recognize, as it amounts to an implementation detail.)
>>>>
>>>> #g
>>>> --
>>>>
>>>> Daniel Garijo wrote:
>>>>
>>>> Hi Graham,
>>>> I like Provenance Container. What if your provenance statements were created by different persons, processes or at different times, but they are within the same Provenance Document (since they are provenance assertions about the same entity)? I may want to describe the different provenance containers, or even the provenance container descriptions with another one.
>>>>
>>>> Thanks,
>>>> Daniel
>>>>
>>>> 2011/8/15 Graham Klyne <GK@ninebynine.org>
>>>>
>>>> Jim,
>>>>
>>>> FWIW, in PAQ we talk about "provenance information" as just another resource that includes provenance assertions. To my mind, its primary representation would be as an RDF document.
>>>>
>>>> The terminology here is subject to review and harmonization with the model, but I'm not convinced that we need a new concept in the model for this, and I'm not keen on a name involving "container", as in my mind that sets up expectations of a distinct layer of encapsulation. We don't talk about "containers" for HTML or XML elements, we just talk about HTML and XML documents. Same for provenance, IMO.
>>>>
>>>> I suppose that suggests "Provenance Document", or similar.
>>>>
>>>> #g
>>>> --
>>>>
>>>> Myers, Jim wrote:
>>>>
>>>> A couple of quick comments: I don't think we've distinguished provenance container and account at this point - they are an entity which contains provenance statements and are used to enable you to talk about how the provenance was created (what processes and inputs caused those statements to be), but collection has been discussed as a general aggregate entity/container - a bag of marbles is an entity and saying a process execution used it is shorthand for talking about the individual marbles. A file is a collection of bytes and a process execution may only use some of the bytes, etc.
>>>>
>>>> Re: roles - I would argue that you should use something quite specific for the role of your temperature parameter, e.g. "processingtemperaturesetpoint" rather than a generic "input" or "inputParameter" role (parameter might still be a supertype of processingtemperaturesetpoint). This would be necessary if, for example, your process execution had a reaction temperature and a storage temperature as inputs - now you have two numbers/two temperatures and you have to use each in the correct role for the provenance to be correct.
>>>> In many cases, you could potentially describe the type of the entity itself well enough to make the provenance clear, but putting the information into the entity typing rather than into the role it has relative to the process execution causes trouble if you use the entity in multiple processes (if I make an entity that is of type "processingtemperaturesetpoint" and I have a second process that displays a "printablenumber" that uses it as input, the same entity can't also be of type "printablenumber" - better to make the entity have type number and play a "processingtemperaturesetpoint" role in one process and the "printablenumber" role in the other.)
>>>>
>>>> Jim
>>>>
>>>> *From:* public-prov-wg-request@w3.org [mailto:public-prov-wg-request@w3.org] *On Behalf Of* Satya Sahoo
>>>> *Sent:* Monday, August 15, 2011 11:02 AM
>>>> *To:* Deus, Helena
>>>> *Cc:* Khalid Belhajjame; public-prov-wg@w3.org
>>>> *Subject:* Re: playing with pil ontology
>>>>
>>>> Hi Lena,
>>>>
>>>> Thanks again for trying to use the ontology for the microarray use case! My comments are inline:
>>>>
>>>> > I am not questioning whether agent should be mapped to agents defined elsewhere, which seems to be obvious - only wondering whether agent "label" and "description" are things we want to standardize in our model or not. We can "suggest" rdfs:label and rdfs:comment without enforcing it as such - having those included in the model will likely result in much less heterogeneity when it comes to reporting provenance (particularly since we are defining it necessarily "open" and highly granular to fit any particular domain).
>>>>
>>>> I am not sure I understand your point. The rdfs:label and rdfs:comment are two of the nine annotation properties that are part of the OWL2 syntax. So, the provenance ontology encoded in OWL includes them by default.
>>>>
>>>> > What was its intended purpose/role in the description of provenance?
>>>>
>>>> Provenance container, account, and collection are related concepts for modeling a collection of provenance assertions. E.g. provenance of an Affymetrix gene chip will be a collection of provenance assertions (date of manufacture, location of manufacturer, production series etc.) that can be stored in a single file and the file will be a provenance container.
>>>>
>>>> > Example: a list of height measurements is an "untransformed" entity (a dataset); the average of that list is the "transformed" entity (another dataset, although a very simple one).
>>>> > I am dealing with much more complex workflows (e.g. files containing the outcome of a microarray experiment as the untransformed dataset and a list of differentially expressed genes as the transformed dataset), so please take the example above as just illustrative.
>>>>
>>>> I am not sure I see the granularity/expressivity issue in the above example (from your first mail). Both the "untransformed" and "transformed" entities map to input and output data of a process execution - we can create subclasses of Entity for this purpose.
>>>>
>>>> > An investigator (agent) performs an experiment. That experiment has several input parameters, some of which are entities (e.g. samples), others are not (e.g. temperature). Resulting from the experiment are several output parameters (entities).
>>>>
>>>> I am confused by the above scenario. Why is temperature not an entity? Both the input (sample) and (temperature) are special types (subclasses) of entities - (a) InputData and (b) InputParameter etc.
>>>>
>>>> > So if I understand what you are saying correctly, "temperature" would be an entity of type "input", which in turn would be a subclass of "role". An instance of "input" could then have a certain value (e.g. 15C) in one of its properties?
>>>> > In that case, does it make sense to include "input" and "output" classes in the model as subclasses of "role"? Or is this something that me and Stephan exemplify in the primer document under "usage of agent" (or something of the sort)?
>>>>
>>>> I agree with Khalid's example where Role allows us to model more complex scenarios.
>>>> For example, X is an instance of class HumanBeing (perhaps as a subclass of entity) and X has multiple roles - researcher, parent, soccer player etc. To model these "functions" we will use the Role class. I believe in the microarray scenario (in your first mail) Roles are not needed.
>>>>
>>>> > In that case, does it make sense to include "input" and "output" classes in the model as subclasses of "role"? Or is this something that me and Stephan exemplify in the primer document under "usage of agent" (or something of the sort)?
>>>>
>>>> Sorry, I did not understand this. Role can be used by any entity, why only "usage of agent"?
>>>>
>>>> Thanks.
>>>>
>>>> Best,
>>>> Satya
>>>>
>>>> On Mon, Aug 15, 2011 at 7:01 AM, Deus, Helena <helena.deus@deri.org> wrote:
>>>>
>>>> Hi Khalid,
>>>>
>>>> Please see comments inline
>>>>
>>>> *From:* Khalid Belhajjame [mailto:Khalid.Belhajjame@cs.man.ac.uk]
>>>> *Sent:* 12 August 2011 10:22
>>>> *To:* Deus, Helena
>>>> *Cc:* public-prov-wg@w3.org
>>>> *Subject:* Re: playing with pil ontology
>>>>
>>>> Hi Helena,
>>>>
>>>> Thanks for this, I think that this is a good exercise and some of the points you mentioned relate to the conceptual model, not only the formal model.
>>>>
>>>> On 11/08/2011 18:52, Deus, Helena wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Reiterating a bit on what was addressed today in the telco, I downloaded the ontology from mercurial and tried to use it with my use case.
>>>>
>>>> I am using the use cases published in [1] and demoed with SPARQL at http://biordfmicroarray.googlecode.com/hg/sparql_endpoint.html
>>>>
>>>> Here is my input so far:
>>>>
>>>> Agent could have dataProperty "label" and "description"; it would help the implementer describe what type of agent he/she intends to describe. Is the ontology here being confused with the query model?
>>>>
>>>> I think that there was previously a long thread discussion on agent and agent types, and whether the model should be prescriptive in this respect. One of the solutions that I think many people were happy with is to let users choose their favorite model (ontology) for agent, which means that the agent class defined in the ontology acts as a placeholder that can be specialized to include description, types, and whatever the application needs.
>>>>
>>>> I am not questioning whether agent should be mapped to agents defined elsewhere, which seems to be obvious - only wondering whether agent "label" and "description" are things we want to standardize in our model or not. We can "suggest" rdfs:label and rdfs:comment without enforcing it as such - having those included in the model will likely result in much less heterogeneity when it comes to reporting provenance (particularly since we are defining it necessarily "open" and highly granular to fit any particular domain).
>>>>
>>>> ProvenanceContainer is not useful, or its description is not clear; what should be an instance of provenanceContainer?
>>>>
>>>> At this stage, the description of this concept is not yet stable in
>>>> the conceptual model as far as I know.
>>>>
>>>> What was its intended purpose/role in the description of provenance?
>>>>
>>>> I want to create an instance of an “untransformed” entity (in my
>>>> case, a dataset) and a “transformed” entity. Is the model going to
>>>> give me that granularity/expressivity or do we expect each
>>>> implementer to come up with their own way of defining these?
>>>>
>>>> Could you please clarify what you mean by transformed and
>>>> untransformed entity?
>>>>
>>>> Example: a list of height measurements is an “untransformed” entity
>>>> (a dataset); the average of that list is the “transformed” entity
>>>> (another dataset, although a very simple one).
>>>>
>>>> I am dealing with much more complex workflows (e.g. files containing
>>>> the outcome of a microarray experiment as the untransformed dataset
>>>> and a list of differentially expressed genes as the transformed
>>>> dataset), so please take the example above as just illustrative.
>>>>
>>>> ProcessExecution needs more expressivity, I think. Not sure how to
>>>> solve this in a domain-independent way, but here’s my problem:
>>>>
>>>> An investigator (agent) performs an experiment.
>>>>
>>>> That experiment has several input parameters, some of which are
>>>> entities (e.g. samples), others are not (e.g. temperature).
>>>>
>>>> Resulting from the experiment are several output parameters
>>>> (entities).
>>>>
>>>> I think that the current model caters for the above need. If you are
>>>> specifically trying to differentiate between different kinds of
>>>> inputs (samples as opposed to temperature), then the notion of role
>>>> can be helpful in this respect.
>>>>
>>>> So if I understand what you are saying correctly, “temperature”
>>>> would be an entity of type “input”, which in turn would be a
>>>> subclass of “role”. An instance of “input” could then have a certain
>>>> value (e.g. 15C) in one of its properties?
>>>>
>>>> In that case, does it make sense to include “input” and “output”
>>>> classes in the model as subclasses of “role”? Or is this something
>>>> that Stephan and I exemplify in the primer document under “usage of
>>>> agent” (or something of the sort)?
>>>>
>>>> Thanks, khalid
>>>>
>>>> Have not completed my “experiment” yet, but will provide more
>>>> feedback soon :)
>>>>
>>>> Best Regards,
>>>>
>>>> Helena F. Deus
>>>> Post-doctoral Researcher
>>>> Digital Enterprise Research Institute
>>>> National University of Ireland, Galway
>>>> http://lenadeus.info
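
For readers following the quoted exchange, here is one way the height-measurement example could be written down. It is only a sketch under assumptions: the property names ("used", "wasGeneratedBy", "hadRole", "wasControlledBy"), the "ex:" identifiers and the sample values are hypothetical placeholders chosen for illustration, not terms fixed by the model under discussion. It shows the "untransformed" and "transformed" datasets as entities with roles, and temperature as a non-entity parameter of the process execution.

```python
from dataclasses import dataclass, field
from statistics import mean


@dataclass(frozen=True)
class Entity:
    """A thing whose provenance is described (here: a dataset)."""
    id: str
    label: str  # plays the part that rdfs:label would play in RDF


@dataclass
class ProcessExecution:
    """One run of a process, controlled by an agent (hypothetical shape)."""
    id: str
    agent: str
    used: list = field(default_factory=list)        # (entity, role) pairs
    generated: list = field(default_factory=list)   # (entity, role) pairs
    parameters: dict = field(default_factory=dict)  # non-entity inputs

    def triples(self):
        """Flatten the record into (subject, predicate, object) tuples."""
        out = [(self.id, "wasControlledBy", self.agent)]
        for entity, role in self.used:
            out.append((self.id, "used", entity.id))
            # Simplification: the role hangs off the entity, not the usage.
            out.append((entity.id, "hadRole", role))
        for entity, role in self.generated:
            out.append((entity.id, "wasGeneratedBy", self.id))
            out.append((entity.id, "hadRole", role))
        for name, value in self.parameters.items():
            out.append((self.id, name, value))
        return out


# Made-up sample data for the "untransformed" dataset.
heights_cm = [150.0, 160.0, 170.0]
raw = Entity("ex:heights", "height measurements")
avg = Entity("ex:meanHeight", "mean height")

run = ProcessExecution(
    id="ex:averaging1",
    agent="ex:investigator",
    used=[(raw, "input")],
    generated=[(avg, "output")],
    parameters={"ex:temperature": "15C"},
)
average = mean(heights_cm)  # the value carried by the "transformed" entity

for triple in run.triples():
    print(triple)
```

Whether "input" and "output" should be classes in the model or just role values chosen by each application is exactly the open question in the thread; the sketch takes the second option only because it is shorter to write.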
Received on Thursday, 25 August 2011 15:49:01 UTC