- From: Paul Groth <p.t.groth@vu.nl>
- Date: Sun, 3 Jun 2012 19:40:20 +0300
- To: Graham Klyne <GK@ninebynine.org>
- Cc: Jun Zhao <jun.zhao@zoo.ox.ac.uk>, Luc Moreau <L.Moreau@ecs.soton.ac.uk>, "public-prov-wg@w3.org" <public-prov-wg@w3.org>
Hi Graham,

I would argue that being able to refer to a bundle in which the provenance of an entity is contained is an important piece of functionality to allow people to easily organize their provenance information.

I can see the point about trying to reuse the relation between the PAQ and the dm.

cheers
Paul

On Sun, Jun 3, 2012 at 9:48 AM, Graham Klyne <GK@ninebynine.org> wrote:
> (I'm replying arbitrarily to Jun's email to maintain the thread, but my comment
> is to the issue in general. As it happens, my point about semantics is
> underscored by Jun's comment about time constraint - I think it's a non-issue
> here, but not obviously so.)
>
> I think the problem we're running into is that we agreed at the last F2F to
> remove all the additional semantics associated with account. Thus, to
> paraphrase Simon's excellent summary, a bundle is just a named set of provenance
> statements without any further semantics. But it appears that Luc's example
> needs more semantics than just a named set of provenance statements - and that's
> where I think we are running into problems, because we are not clear about
> exactly what those additional semantics should be.
>
> Therefore I suggest that, according to prior WG agreement, Luc's example is out
> of scope for us to fully resolve. Paul's suggestion to provide the attributes
> as an extensibility hook is one possible approach.
>
> Another possible and more radical approach, prompted by Tim's earlier suggestion
> to take a local name from DC, is to drop hasProvenanceIn entirely from the prov
> specification, and (in the usage guidelines) document use of the DC term for this
> purpose. This will leave the field clear for subsequent work to define a
> suitable cross-bundle primitive when we have a clearer common understanding of
> the actual requirements.
>
> In summary, options that work for me would be (in order of preference):
> (1) drop hasProvenanceIn entirely and move on.
> Use existing terms from other
> vocabularies to express this idea. (**)
> (2) adopt Paul's suggestion of an extensible 2-place relation (*)
>
> (*) noting the importance of monotonicity here: extension attributes must not be
> able to change semantics of the underlying property. If the underlying property
> has no (formal) semantics, this is easy. If the underlying property does have
> built-in semantics, then the utility of the extension may be limited (or worse,
> careless extensions may break the underlying semantic model associated with the
> core provenance model).
>
> (**) the slight inconsistency here would be that PROV-AQ still requires a
> prov:hasProvenance relation. I'm OK with this because PROV-AQ is intended to
> address operational concerns where the model is not. But this does create a
> reasonably compelling argument for having a corresponding relation in the model
> - if the semantics are minimal then the same relation can work at both levels.
>
> #g
> --
>
>
> On 02/06/2012 22:36, Jun Zhao wrote:
>> Paul,
>>
>> At first sight, I loved your proposal. But after reading into it, I got less sure.
>>
>> This property is to allow locating the bundle in which the provenance of an entity is described. To qualify this, would it mean that, e.g., there is a time period during which you can find provenance of that entity in the bundle and after that you can't?
>>
>> Although the pattern you propose makes sense, I can't see when people need to qualify this relation. If you have a more concrete example in mind, I am ready to be convinced!
>>
>> Cheers,
>>
>> Jun
>>
>> Sent from my iPad (sorry for the brevity)
>>
>> On 1 Jun 2012, at 17:03, Paul Groth<p.t.groth@vu.nl> wrote:
>>
>>> Hi All,
>>>
>>> It seems that one approach would be to define an extensible version
>>> of hasProvenanceIn and leave it at that.
>>>
>>> hasProvenanceIn(id, entity, bundle, attrs).
>>>
>>> Like all our extensible relations, we would also have the straight
>>> binary version
>>>
>>> hasProvenanceIn(entity,bundle)
>>>
>>> This would allow for the extensibility to cater for Luc's use case but
>>> also for other use cases where extension is nice. For example, I can
>>> imagine a system wanting to put a time constraint on the applicability
>>> of provenance in a bundle to an entity.
>>>
>>> This would leave it up to people to define specialization, alternate
>>> and derivation relations between entities as they want.
>>>
>>> Would this be acceptable to the group?
>>>
>>> Thanks
>>> Paul
>>>
>>>
>>>
>>> On Fri, Jun 1, 2012 at 5:33 PM, Luc Moreau<L.Moreau@ecs.soton.ac.uk> wrote:
>>>> Hi Simon,
>>>>
>>>> Thanks for your message. I feel you don't directly respond to the points
>>>> that I raised,
>>>> and therefore all my comments stand.
>>>>
>>>> I respond to your points below.
>>>>
>>>> On 06/01/2012 03:39 PM, Miles, Simon wrote:
>>>>> Hi Luc,
>>>>>
>>>>> I will try to articulate the points which I think back up the binary relations proposal.
>>>>>
>>>>> 1. As I understood, there is currently no semantics to a bundle. A querier can choose to consider the descriptions in the bundle or not (based on the bundle's provenance), but whether there are one or many bundles, the querier just has a set of PROV descriptions. The bundles need to be found and known to be relevant, which is why hasProvenanceIn (or isTopicOf) is needed. After that, which bundle a description is in is irrelevant and the bundling can be ignored. A specific extension of PROV may change this by adding semantics to bundles, but this is not in the current specification.
>>>>>
>>>>>
>>>>
>>>> A close notion to bundle in prior provenance art is opm:Account, and
>>>> there is plenty of evidence
>>>> that merging accounts may lead to contradictions. PROV, rightly so,
>>>> does not define a union operator
>>>> over bundles, and is silent about merging or not bundles.
>>>>
>>>> Therefore, there is nothing in PROV that backs this statement "which
>>>> bundle a description is in is
>>>> irrelevant and the bundling can be ignored".
>>>>
>>>> You are suggesting that an extension of PROV may add semantics to
>>>> bundles: that's exactly what you
>>>> have done, by implying they are mergeable.
>>>>
>>>>> Taking the statements from the three bundles below, a querier would end up with:
>>>>>
>>>>> activity(ex:a1, 2011-11-16T16:00:00,2011-11-16T17:00:00)
>>>>> wasAssociatedWith(ex:a1,ex:Bob,[prov:role="controller"])
>>>>> activity(ex:a2, 2011-11-17T10:00:00,2011-11-17T17:00:00)
>>>>> wasAssociatedWith(ex:a2,ex:Bob,[prov:role="controller"])
>>>>> agent(tool:Bob1, [perf:rating="good"])
>>>>> agent(tool:Bob2, [perf:rating="bad"])
>>>>>
>>>>> I can see nothing in the current specification to suggest this means anything different to when these descriptions are separated into multiple bundles. Do you agree?
>>>>>
>>>>>
>>>>
>>>> PROV does not specify whether they mean something different or not.
>>>>
>>>>> 2. If there are two entity identifiers relating to the same thing/entity, we need to say how they are connected: either alternateOf, specializationOf, or possibly some external relation such as owl:sameAs. While the example below happens to imply a specialisation relation between tool:Bob1 and ex:Bob, there is no reason to believe this is true in all cases: alternateOf is just as possible. So, hasProvenanceIn cannot imply or be a sub-type of either specializationOf or alternateOf, the appropriate one must be asserted separately.
>>>>>
>>>>
>>>> I agree that being able to assert subtypes for hasProvenanceIn is
>>>> important: that's why I am
>>>> in favour of having hasProvenanceIn as an n-ary relation that includes
>>>> attributes so that prov:type can be
>>>> used for what you suggest.
>>>>> 3. The same thing described from different perspectives has multiple identifiers regardless of bundles, i.e. at least one for each entity.
>>>>> When a bundle is newly read by a querier interested in the provenance of entity E, they should consider every entity E is a specialisation of, and look for those identifiers as well. If they don't, they will miss information about the provenance of E described at a coarser granularity.
>>>>>
>>>>> For example, ex:Bob may be a specialisation of ex:GeneralBob, and bundle ex:run1 might describe something about ex:GeneralBob's provenance. This makes "hasProvenanceIn(tool:Bob1, ex:run1, ex:Bob)" strange, because it is not only ex:Bob that is relevant to look for in ex:run1.
>>>>>
>>>>> Separating concerns, I'd argue it is preferable to say:
>>>>> hasProvenanceIn(tool:Bob1, ex:run1)
>>>>> specializationOf(tool:Bob1, ex:Bob)
>>>>> specializationOf(tool:Bob, ex:GeneralBob)
>>>>>
>>>> But this latter statement would belong to the ex:run1 bundle I assume.
>>>> It is not going to be known to be relevant to me until I have correctly
>>>> been able to link tool:Bob1 to ex:Bob in run1.
>>>>
>>>>
>>>>> and let the que
>>
>>
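
To make the two shapes discussed in this thread concrete, here is a minimal Python sketch of the binary relation hasProvenanceIn(entity, bundle), the extensible form hasProvenanceIn(id, entity, bundle, attrs), and the lookup Simon describes, where a querier also searches for every entity that E is a specialisation of. It is only an illustration under stated assumptions, not anything defined by PROV: the class and function names, the identifier ex:link1 and the attribute value ex:TimeLimited are hypothetical, and the second specializationOf pair uses ex:Bob (following the prose "ex:Bob may be a specialisation of ex:GeneralBob") rather than the tool:Bob written in the quoted example.

```python
from dataclasses import dataclass
from typing import Optional, Set, Tuple


@dataclass(frozen=True)
class HasProvenanceIn:
    """Hypothetical record for hasProvenanceIn; id and attrs are optional."""
    entity: str
    bundle: str
    id: Optional[str] = None
    attrs: Tuple[Tuple[str, str], ...] = ()

    def binary(self) -> Tuple[str, str]:
        # Projection onto the plain 2-place relation. Attributes are ignored,
        # so adding them cannot change what the binary relation states.
        return (self.entity, self.bundle)


def generalisations(entity: str, specializations: Set[Tuple[str, str]]) -> Set[str]:
    """Return entity plus everything it is (transitively) a specialisation of."""
    seen = {entity}
    stack = [entity]
    while stack:
        current = stack.pop()
        for specific, general in specializations:
            if specific == current and general not in seen:
                seen.add(general)
                stack.append(general)
    return seen


def bundles_for(entity: str, links) -> Set[str]:
    """Bundles that an entity is linked to, using only the binary projection."""
    return {bundle for (e, bundle) in (link.binary() for link in links) if e == entity}


# Statements from the example, asserted separately from hasProvenanceIn.
specializations = {
    ("tool:Bob1", "ex:Bob"),
    ("ex:Bob", "ex:GeneralBob"),  # assumption: ex:Bob, per the prose
}

plain = HasProvenanceIn("tool:Bob1", "ex:run1")
extended = HasProvenanceIn(
    "tool:Bob1", "ex:run1",
    id="ex:link1",                           # hypothetical identifier
    attrs=(("prov:type", "ex:TimeLimited"),)  # hypothetical extension attribute
)

found_bundles = bundles_for("tool:Bob1", [plain, extended])
identifiers_to_search = generalisations("tool:Bob1", specializations)
```

In this sketch the extended link and the plain link project to the same pair, which is one way to read the monotonicity requirement: extension attributes add information without altering the underlying 2-place statement.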
Received on Sunday, 3 June 2012 16:40:50 UTC