RE: PROV-ISSUE-385 (haProvenanceIn-complexity): The hasProvenbanceIn relation is over-complicated [prov-dm]

From: Miles, Simon <simon.miles@kcl.ac.uk>
Date: Tue, 29 May 2012 10:27:09 +0100
To: "public-prov-wg@w3.org" <public-prov-wg@w3.org>
Message-ID: <830EEE5C741ED54EAB28EBACFFC77984EE856F562D@KCL-MAIL04.kclad.ds.kcl.ac.uk>

Hi Luc,

If I'm interpreting the example correctly, I see the following:

 - Bundles ex:run1 and ex:run2 refer to an entity (agent), ex:Bob, at a coarse granularity that does not distinguish between that entity at the times of the different activities. They use the same ID, so it is exactly the same entity referred to.

 - Bundle tool:analysis01 declares two entities, stating that each has some equivalence to ex:Bob (currently using hasProvenanceIn) but has distinct attributes.

Surely this seems exactly the case for using specializationOf, as if tool:Bob1 and tool:Bob2 are equivalent to ex:Bob but have different attributes, they must be constrained perspectives on the same thing?

I don't see the problem in the example: ex:Bob is not itself rated good or bad, but specializationOf would tell you that ex:Bob has been rated good and bad in different constrained contexts. What would be wrong with the below?

  entity(tool:Bob1, [perf:rating="good"])
  hasProvenanceIn(tool:Bob1, ex:run1)  // Bob performance in ex:run1 is good
  specializationOf(tool:Bob1, ex:Bob)

  entity(tool:Bob2, [perf:rating="bad"])
  hasProvenanceIn(tool:Bob2, ex:run2)  // Bob performance in ex:run2 is bad
  specializationOf(tool:Bob2, ex:Bob)
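
As a rough illustration of how the two binary relations above could be combined, here is a minimal Python sketch. The dictionaries, lists and the ratings_by_context helper are hypothetical names invented for this example; they are not part of PROV-DM or of any PROV library. The sketch joins specializationOf with the two-argument hasProvenanceIn to recover the bundle in which each rating applies.

```python
# Hypothetical in-memory encoding of the statements above (not a PROV API).
entities = {
    "tool:Bob1": {"perf:rating": "good"},
    "tool:Bob2": {"perf:rating": "bad"},
}
has_provenance_in = [("tool:Bob1", "ex:run1"), ("tool:Bob2", "ex:run2")]
specialization_of = [("tool:Bob1", "ex:Bob"), ("tool:Bob2", "ex:Bob")]


def ratings_by_context(general):
    """Return {bundle: rating} for every specialization of `general`."""
    specific = {s for s, g in specialization_of if g == general}
    return {
        bundle: entities[subject]["perf:rating"]
        for subject, bundle in has_provenance_in
        if subject in specific
    }


print(ratings_by_context("ex:Bob"))
```

Under these assumptions, ex:Bob itself carries no rating; the ratings are reached only through its specializations and the bundles they point to.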


Dr Simon Miles
Senior Lecturer, Department of Informatics
Kings College London, WC2R 2LS, UK
+44 (0)20 7848 1166

From: Luc Moreau [L.Moreau@ecs.soton.ac.uk]
Sent: 29 May 2012 10:01
To: public-prov-wg@w3.org
Subject: Re: PROV-ISSUE-385 (haProvenanceIn-complexity): The hasProvenbanceIn relation is over-complicated [prov-dm]

Hi Simon and Jim,

Whether we have an id and/or attributes is a secondary question. What we need
to clarify is which concepts are involved in this relation.

In principle, I am in agreement with you. In practice, I don't think we can do
it with the current alternate relation.

Here is my use case:

bundle ex:run1
 activity(ex:a1, 2011-11-16T16:00:00, 2011-11-16T17:00:00)  // duration: 1 hour

bundle ex:run2
 activity(ex:a2, 2011-11-17T10:00:00, 2011-11-17T17:00:00)  // duration: 7 hours

bundle tool:analysis01

  entity(tool:Bob1, [perf:rating="good"])
  hasProvenanceIn(tool:Bob1, ex:run1, ex:Bob)  // Bob performance in ex:run1 is good

  entity(tool:Bob2, [perf:rating="bad"])
  hasProvenanceIn(tool:Bob2, ex:run2, ex:Bob)  // Bob performance in ex:run2 is bad


In the bundle tool:analysis01, a tool rates the performance of agents in other bundles.
ex:Bob performance in ex:run1 is good, and bad in ex:run2.
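
To make the role of the third argument concrete, here is a minimal Python sketch of the three-argument relation. The HasProvenanceIn tuple, its field names and the rating_of helper are assumptions made for illustration only; they are not defined by PROV-DM. The sketch shows the bundle acting as the context that separates the two ratings of ex:Bob.

```python
from typing import NamedTuple


class HasProvenanceIn(NamedTuple):
    # Field names are hypothetical; they mirror the argument order above.
    subject: str  # identifier used in the current bundle
    bundle: str   # bundle holding further provenance
    target: str   # identifier used for the subject inside that bundle


ratings = {"tool:Bob1": "good", "tool:Bob2": "bad"}
links = [
    HasProvenanceIn("tool:Bob1", "ex:run1", "ex:Bob"),
    HasProvenanceIn("tool:Bob2", "ex:run2", "ex:Bob"),
]


def rating_of(target, bundle):
    """Rating given to `target` in the context of `bundle`, or None."""
    for link in links:
        if link.target == target and link.bundle == bundle:
            return ratings.get(link.subject)
    return None


print(rating_of("ex:Bob", "ex:run1"))
print(rating_of("ex:Bob", "ex:run2"))
```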

If, as you suggest, I use
  alternateOf(tool:Bob1, ex:Bob)
instead of
  hasProvenanceIn(tool:Bob1, ex:run1, ex:Bob)
we do not make explicit the context in which ex:Bob is rated.

Simon seems to suggest we could have a specializationOf over bundles.  I think that is too coarse a granularity,
and it wouldn't help in this case.


On 05/29/2012 09:40 AM, Miles, Simon wrote:
I tend to agree with Graham and Jim.

The hasProvenanceIn relation is not a description of provenance, it is about locations of provenance data. It seems unnecessary to apply the same rules as for relations describing the past.

In particular, I'm not clear how attributes should be interpreted for hasProvenanceIn: attributes of what? If we mean metadata about the bundle pointed to, e.g. its format, I think this goes beyond provenance and would ideally be left to serialisations to appropriately address.

Having an ID makes sense for entity(), activity(), agent() etc. because we are giving an identifier to something referred to in other PROV descriptions. I'd argue we don't need a general rule of identifying every description, because it's not obviously about provenance and any given serialisation could easily do that where required.

I agree with Jim that it seems important to use alternateOf here: this seems like the situation that specializationOf/alternateOf were really designed for. Linked bundles with different IDs for the same entity seem most likely where different asserters provide provenance on the same entity or one asserter provides different accounts of the provenance of an entity. Each bundle then takes a particular perspective on the entity. Where we can be more precise and say one bundle's perspective is a specializationOf the other, that is even better.


Dr Simon Miles
Senior Lecturer, Department of Informatics
Kings College London, WC2R 2LS, UK
+44 (0)20 7848 1166

From: Jim McCusker [mccusj@rpi.edu]
Sent: 28 May 2012 21:59
To: Luc Moreau
Cc: public-prov-wg@w3.org
Subject: Re: PROV-ISSUE-385 (haProvenanceIn-complexity): The hasProvenbanceIn relation is over-complicated [prov-dm]

Can't that be decomposed into:

alternateOf(alice:report1, ex:report1)


We should focus on re-using constructs rather than implicitly re-introducing them into relations like this. Especially since the idea of a target is entirely optional, as bob and alice may already be using the same URIs.


On Mon, May 28, 2012 at 4:26 PM, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
Hi Graham,

Like PROV-AQ, we need a target.
Example 47 illustrates the need for it:

 hasProvenanceIn(alice:report1, bob:bundle4, ex:report1)

In the current bundle, there is a description for alice:report1.
More provenance can be found for it in bundle bob:bundle4, under the name ex:report1.
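
The lookup described here can be sketched in a few lines of Python. The resolve function and the tuple encoding are hypothetical, and the second link (alice:report2) is invented for illustration. The fallback to the subject's own identifier when no target is given is an assumption, covering the case where both parties already use the same URI.

```python
def resolve(subject, links):
    """Return (bundle, name) pairs saying where to look for more provenance.

    Each link is (subject, bundle) or (subject, bundle, target). When the
    target is omitted, the subject's own identifier is assumed.
    """
    found = []
    for link in links:
        if link[0] != subject:
            continue
        bundle = link[1]
        target = link[2] if len(link) > 2 else subject
        found.append((bundle, target))
    return found


links = [
    ("alice:report1", "bob:bundle4", "ex:report1"),  # Example 47
    ("alice:report2", "bob:bundle4"),  # hypothetical link without a target
]

print(resolve("alice:report1", links))
print(resolve("alice:report2", links))
```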

The presence of attributes and id follows the pattern of other qualified relations.


On 28/05/12 20:01, Provenance Working Group Issue Tracker wrote:
PROV-ISSUE-385 (haProvenanceIn-complexity): The hasProvenbanceIn relation is over-complicated [prov-dm]


Raised by: Graham Klyne
On product: prov-dm

I'm raising this issue as a placeholder and for discussion.  I didn't notice the arrival of prov:hasProvenanceIn, but based on its appearance in http://dvcs.w3.org/hg/prov/raw-file/default/model/releases/ED-prov-dm-20120525/prov-dm.html (which AFAIK is not a currently active draft, but a proposal), it is rather over-complicated and a bit obscure.

My sense is that, especially as this is motivated by PROV-AQ, there are just too many identifiers floating around.

Instead of:

  hasProvenanceIn(id, subject, bundle, target, attrs)

Why not just:

  hasProvenanceIn(subject, bundle)

Where subject is based on the URI of an entity, and bundle is based on the URI of a provenance bundle with information about that entity.

I would like to understand what real scenario justifies all the added machinery that has been included with this relation.

Jim McCusker
Programmer Analyst
Krauthammer Lab, Pathology Informatics
Yale School of Medicine
james.mccusker@yale.edu | (203) 785-6330

PhD Student
Tetherless World Constellation
Rensselaer Polytechnic Institute

Professor Luc Moreau
Electronics and Computer Science   tel:   +44 23 8059 4487
University of Southampton          fax:   +44 23 8059 2865
Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
Received on Tuesday, 29 May 2012 09:28:53 UTC