RE: PROV-ISSUE-249 (two-derivations): Why do we have 3 derivations? [prov-dm]

Luc, Khalid,

I agree with reducing the number of derivation relations for simplicity, and I think Khalid puts the case well. If accounts are just bundles without validity constraints, there is no difference between 1 and n activities: they are just different granularities of description of the same thing.

I'm not too clear how the imprecise and precise relations now differ. Can't there be different levels of precision (as the provenance becomes less scruffy)? For example, it is more precise to give an identifier for the activity that the derivation is due to, then more precise to give roles of the derivation entities in that activity, then more precise to give attributes of that activity, etc. Each amount of detail asserted allows a more precise reproduction of the activity.

If so, the above could be addressed in different ways. If you would like a concrete proposal: have only one form of derivation assertion, the imprecise one (wasDerivedFrom), and then a separate assertion to link a derivation to the activity it was due to (wasDerivedBy). All other "precise" information would be asserted using existing record types in the DM.
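
To make the proposal a little more tangible, here is a rough sketch in Python. It is only an illustration: the class names, field names and the activity_for helper are my own assumptions, and wasDerivedBy is the proposed assertion, not an existing record type in the DM. The point is that a derivation starts out imprecise and becomes more precise simply by asserting further records about it.

```python
from dataclasses import dataclass
from typing import List, Optional, Union


@dataclass(frozen=True)
class WasDerivedFrom:
    """The single, imprecise derivation assertion: e2 was derived from e1."""
    id: str          # identifier of the derivation itself
    generated: str   # e2, the derived entity
    used: str        # e1, the entity it was derived from


@dataclass(frozen=True)
class WasDerivedBy:
    """Hypothetical separate assertion linking a derivation to its activity."""
    derivation: str  # id of a WasDerivedFrom record
    activity: str    # identifier of the activity the derivation was due to


Record = Union[WasDerivedFrom, WasDerivedBy]


def activity_for(derivation_id: str, records: List[Record]) -> Optional[str]:
    """Return the activity asserted for a derivation, or None if none is known."""
    for record in records:
        if isinstance(record, WasDerivedBy) and record.derivation == derivation_id:
            return record.activity
    return None


# Scruffy provenance: only the derivation is asserted.
records: List[Record] = [WasDerivedFrom("d1", "e2", "e1")]
activity_before = activity_for("d1", records)

# Later, the same derivation is made more precise by one extra assertion.
records.append(WasDerivedBy("d1", "a1"))
activity_after = activity_for("d1", records)
```

Roles, attributes of the activity and so on would be layered on in the same way, as further assertions that refer to the derivation or the activity by identifier.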

thanks,
Simon

Dr Simon Miles
Lecturer, Department of Computer Science
Kings College London, WC2R 2LS, UK
+44 (0)20 7848 1166

Modelling the Provenance of Data in Autonomous Systems:
http://eprints.dcs.kcl.ac.uk/1264/
________________________________________
From: Khalid Belhajjame [Khalid.Belhajjame@cs.man.ac.uk]
Sent: 10 February 2012 10:13
To: Provenance Working Group
Cc: Provenance Working Group Issue Tracker
Subject: Re: PROV-ISSUE-249 (two-derivations): Why do we have 3 derivations?  [prov-dm]

+1

I think this proposal will simplify the model.
Applying it will also, IMO, remove some confusion, by avoiding
talk of the granularity of the activities
involved in the derivation. In particular, what for one observer can be
imprecise-1, because s/he believes that the activity involved in the
derivation is atomic, can be seen by another observer as imprecise-n,
because s/he believes that the activity involved in the derivation is
composite. Talking simply about precise and imprecise derivation allows
us to avoid this issue.

Khalid

On 09/02/2012 23:11, Provenance Working Group Issue Tracker wrote:
> PROV-ISSUE-249 (two-derivations): Why do we have 3 derivations? [prov-dm]
>
> http://www.w3.org/2011/prov/track/issues/249
>
> Raised by: Luc Moreau
> On product: prov-dm
>
> We currently have 3 derivations:
>
>
> A precise-1 derivation, written wasDerivedFrom(id, e2, e1, a, g2, u1, attrs)
> An imprecise-1 derivation, written wasDerivedFrom(id, e2, e1, t, attrs)
> An imprecise-n derivation, written wasDerivedFrom(id, e2, e1, t, attrs)
>
>
> Imprecise-1/imprecise-n are distinguished with the attribute prov:steps.
>
> Why do we need 3 derivations?
>
> I believe that imprecise-n derivation is required for the 'scruffy provenance' use case.
>
> I believe that precise-1 derivation is required for the 'proper provenance' use case: in particular, it's a requirement for provenance based reproducibility.
>
> I don't understand why we have imprecise-1.  Why can't we just have
> imprecise-n and precise-1?
>
> PS. If we go with this proposal, then they could simply be called imprecise/precise, and we don't need the attribute steps.
>
> PS2. They would essentially be an unqualified and a qualified derivation (in prov-o terminology).

Received on Monday, 13 February 2012 17:30:24 UTC