- From: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
- Date: Fri, 01 Jun 2012 16:33:41 +0100
- To: public-prov-wg@w3.org
Hi Simon,

Thanks for your message. I feel you don't directly respond to the points that I raised, and therefore all my comments stand. I respond to your points below.

On 06/01/2012 03:39 PM, Miles, Simon wrote:
> Hi Luc,
>
> I will try to articulate the points which I think back up the binary relations proposal.
>
> 1. As I understood, there is currently no semantics to a bundle. A querier can choose to consider the descriptions in the bundle or not (based on the bundle's provenance), but whether there are one or many bundles, the querier just has a set of PROV descriptions. The bundles need to be found and known to be relevant, which is why hasProvenanceIn (or isTopicOf) is needed. After that, which bundle a description is in is irrelevant and the bundling can be ignored. A specific extension of PROV may change this by adding semantics to bundles, but this is not in the current specification.
>

A close notion to bundle in prior provenance art is opm:Account, and there is plenty of evidence that merging accounts may lead to contradictions. PROV, rightly so, does not define a union operator over bundles, and is silent on whether bundles may be merged. Therefore, there is nothing in PROV that backs the statement "which bundle a description is in is irrelevant and the bundling can be ignored". You are suggesting that an extension of PROV may add semantics to bundles: that's exactly what you have done, by implying they are mergeable.

> Taking the statements from the three bundles below, a querier would end up with:
>
>   activity(ex:a1, 2011-11-16T16:00:00, 2011-11-16T17:00:00)
>   wasAssociatedWith(ex:a1, ex:Bob, [prov:role="controller"])
>   activity(ex:a2, 2011-11-17T10:00:00, 2011-11-17T17:00:00)
>   wasAssociatedWith(ex:a2, ex:Bob, [prov:role="controller"])
>   agent(tool:Bob1, [perf:rating="good"])
>   agent(tool:Bob2, [perf:rating="bad"])
>
> I can see nothing in the current specification to suggest this means anything different to when these descriptions are separated into multiple bundles. Do you agree?
>

PROV does not specify whether they mean something different or not.

> 2. If there are two entity identifiers relating to the same thing/entity, we need to say how they are connected: either alternateOf, specializationOf, or possibly some external relation such as owl:sameAs. While the example below happens to imply a specialisation relation between tool:Bob1 and ex:Bob, there is no reason to believe this is true in all cases: alternateOf is just as possible. So, hasProvenanceIn cannot imply or be a sub-type of either specializationOf or alternateOf; the appropriate one must be asserted separately.
>

I agree that being able to assert subtypes for hasProvenanceIn is important: that's why I am in favour of making hasProvenanceIn an n-ary relation that includes attributes, so that prov:type can be used for what you suggest.

> 3. The same thing described from different perspectives has multiple identifiers regardless of bundles, i.e. at least one for each entity. When a bundle is newly read by a querier interested in the provenance of entity E, they should consider every entity E is a specialisation of, and look for those identifiers as well. If they don't, they will miss information about the provenance of E described at a coarser granularity.
>
> For example, ex:Bob may be a specialisation of ex:GeneralBob, and bundle ex:run1 might describe something about ex:GeneralBob's provenance.
> This makes "hasProvenanceIn(tool:Bob1, ex:run1, ex:Bob)" strange, because it is not only ex:Bob that is relevant to look for in ex:run1.
>
> Separating concerns, I'd argue it is preferable to say:
>   hasProvenanceIn(tool:Bob1, ex:run1)
>   specializationOf(tool:Bob1, ex:Bob)
>   specializationOf(ex:Bob, ex:GeneralBob)
>

But this latter statement would belong to the ex:run1 bundle, I assume. It is not going to be known to be relevant to me until I have correctly been able to link tool:Bob1 to ex:Bob in run1.

> and let the querier search ex:run1 for all identifiers relevant to the entity. It seems irrelevant that the identifier tool:Bob1 is itself absent from bundle ex:run1, as it is only one of many identifiers for the entity/thing anyway.
>
> Paraphrasing Paul from the telecon, hasProvenanceIn(tool:Bob1, ex:run1) can just mean "look in ex:run1 for more stuff relevant to tool:Bob1". If you know that tool:Bob1 is a specialisation of ex:Bob, then you should also look for ex:Bob.
>

I prefer Tim's interpretation, "tool:Bob1 is a topic in ex:run1", but I am saying that it is not a topic in ex:run1; ex:Bob is. There is an aliasing issue happening here.

1. If, when generating ex:run1 and ex:run2, I had known about the profiling tool, I could have generated instances ex:bob1 and ex:bob2, so that they can be individually assessed. But that's not the way things work. We reuse identifiers.

2. If I had assessed only one instance of ex:Bob in my tool bundle, then I could have reused the same identifier ex:Bob, and hasProvenanceIn(ex:Bob, ex:run1) would have been sufficient.

It is only because I want to talk about two different specializations of ex:Bob in the tool bundle that I am forced to change the identifiers. It is an aliasing issue.

My objection to a binary hasProvenanceIn(subject, bundle) is that it is not extensible in PROV. I cannot subtype it, and I cannot have a way (standardized or not) of handling the aliasing.

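To make the aliasing issue concrete, here is a minimal Python sketch of the example from this thread. It is only an illustration under stated assumptions: the tuple-based model of bundles, the function names, and the querier strategy (one level of specialization, no transitive closure) are all hypothetical and are not defined anywhere in PROV. The binary variant uses the six isTopicIn/specialization statements from the original message.

```python
from typing import NamedTuple


class Association(NamedTuple):
    """Toy stand-in for wasAssociatedWith(activity, agent, [prov:role=...])."""
    activity: str
    agent: str
    role: str


# The two run bundles from the example, keyed by bundle identifier.
bundles = {
    "ex:run1": [Association("ex:a1", "ex:Bob", "controller")],
    "ex:run2": [Association("ex:a2", "ex:Bob", "controller")],
}

# Ternary form: hasProvenanceIn(subject, bundle, target).
ternary_links = [
    ("tool:Bob1", "ex:run1", "ex:Bob"),
    ("tool:Bob2", "ex:run2", "ex:Bob"),
]

# Binary form: isTopicIn(subject, bundle) plus specialization(specific, general).
is_topic_in = {
    ("tool:Bob1", "ex:run1"),
    ("tool:Bob2", "ex:run2"),
    ("ex:Bob", "ex:run1"),
    ("ex:Bob", "ex:run2"),
}
specialization = {("tool:Bob1", "ex:Bob"), ("tool:Bob2", "ex:Bob")}


def activities_via_ternary(subject: str) -> set[str]:
    """Follow each ternary link into its bundle and match the stated target."""
    found = set()
    for link_subject, bundle, target in ternary_links:
        if link_subject == subject:
            found |= {a.activity for a in bundles[bundle] if a.agent == target}
    return found


def activities_via_binary(subject: str) -> set[str]:
    """Assumed querier: search every bundle reachable from the subject or
    from anything it specializes, matching any of those identifiers."""
    names = {subject} | {g for s, g in specialization if s == subject}
    found = set()
    for name, bundle in is_topic_in:
        if name in names:
            found |= {a.activity for a in bundles[bundle] if a.agent in names}
    return found


print(sorted(activities_via_ternary("tool:Bob1")))  # only the run1 activity
print(sorted(activities_via_binary("tool:Bob1")))   # activities from both runs
```

With the ternary links, tool:Bob1 resolves only to ex:a1. With the binary statements and this (assumed) querier, tool:Bob1 also reaches ex:a2 through ex:Bob, which is how the "good" rating ends up associated with the seven-hour run.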
Luc

> Thanks,
> Simon
>
> Dr Simon Miles
> Senior Lecturer, Department of Informatics
> Kings College London, WC2R 2LS, UK
> +44 (0)20 7848 1166
>
> accounting for the reasons behind contractual violations:
> http://eprints.dcs.kcl.ac.uk/1283/
> ________________________________________
> From: Luc Moreau [L.Moreau@ecs.soton.ac.uk]
> Sent: 31 May 2012 22:54
> To: Provenance Working Group WG
> Subject: ISSUE-385: hasProvenanceIn: finding a solution
>
> All,
>
> To try and converge towards a solution, I am circulating an example using a ternary hasProvenanceIn. I would like to understand if and how we can make it work with a simpler relation.
>
> Two bundles ex:run1 and ex:run2 describe Bob's role as a controller of two activities. Same Bob, two different bundles.
>
> bundle ex:run1
>   activity(ex:a1, 2011-11-16T16:00:00, 2011-11-16T17:00:00)   // duration: 1 hour
>   wasAssociatedWith(ex:a1, ex:Bob, [prov:role="controller"])
> endBundle
>
> bundle ex:run2
>   activity(ex:a2, 2011-11-17T10:00:00, 2011-11-17T17:00:00)   // duration: 7 hours
>   wasAssociatedWith(ex:a2, ex:Bob, [prov:role="controller"])
> endBundle
>
> A performance analysis tool rates the performance of agents (this could be used to dispatch further work to performant agents, or congratulate them, etc).
>
> bundle tool:analysis01
>
>   agent(tool:Bob1, [perf:rating="good"])
>   hasProvenanceIn(tool:Bob1, ex:run1, ex:Bob)   // Bob's performance in ex:run1 is good
>
>   agent(tool:Bob2, [perf:rating="bad"])
>   hasProvenanceIn(tool:Bob2, ex:run2, ex:Bob)   // Bob's performance in ex:run2 is bad
>
> endBundle
>
> The performance analysis tool has to rate two involvements of ex:Bob in two separate activities. Two specialized versions of ex:Bob are defined: tool:Bob1 and tool:Bob2, with rating good and bad respectively.
>
> tool:Bob1 is linked to ex:Bob in run1, and tool:Bob2 is linked to ex:Bob in run2, with the following:
>
>   hasProvenanceIn(tool:Bob1, ex:run1, ex:Bob)
>   hasProvenanceIn(tool:Bob2, ex:run2, ex:Bob)
>
> Nothing is expressed about ex:Bob in bundle tool:analysis01 (except that this is an alias for tool:Bob1 and tool:Bob2).
>
> It is suggested that the ternary relation could be replaced by
>   isTopicIn(tool:Bob1, ex:run1)
> and
>   specialization(tool:Bob1, ex:Bob).
>
> I don't understand the point of
>   isTopicIn(tool:Bob1, ex:run1)
> since tool:Bob1 is not a topic in ex:run1.
>
> Also, we now seem to have made ex:Bob a topic of tool:analysis01, because of the following expression:
>   specialization(tool:Bob1, ex:Bob).
>
> From tool:analysis01, where do I find provenance about ex:Bob? It looks like this has become a dead end in this graph.
>
> Do I need to introduce:
>   isTopicIn(ex:Bob, ex:run1)
>   isTopicIn(ex:Bob, ex:run2)?
>
> So now we would have:
>   isTopicIn(tool:Bob1, ex:run1)
>   specialization(tool:Bob1, ex:Bob)
>   isTopicIn(tool:Bob2, ex:run2)
>   specialization(tool:Bob2, ex:Bob)
>   isTopicIn(ex:Bob, ex:run1)
>   isTopicIn(ex:Bob, ex:run2)
>
> Which means that:
>
>   specialization(tool:Bob1, ex:Bob)
>   isTopicIn(ex:Bob, ex:run2)
>
> ... would lead us to believe that good rating is due to slow performance.
>
> Can the proposer of the separate binary relations explain how this example can work?
>
> Thanks,
> Luc
>

--
Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm

Received on Friday, 1 June 2012 15:34:20 UTC