- From: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
- Date: Thu, 24 Nov 2011 09:49:39 +0000
- To: Graham Klyne <graham.klyne@zoo.ox.ac.uk>
- Cc: Luc Moreau <L.Moreau@ecs.soton.ac.uk>, public-prov-wg@w3.org
On Sat, Nov 19, 2011 at 08:17, Graham Klyne <graham.klyne@zoo.ox.ac.uk> wrote:

> I think a distinction here is between "necessarilyDerivedFrom" and
> "possiblyDerivedFrom" (modal logic, anyone? :) ). (I'm introducing these
> terms for discussion only, I'm not proposing them for use.)
>
> For me, "e2 possiblyDerivedFrom e1" would be statement that e2 has e1
> somewhere in its derivation history, is easy to understand, and I think this
> is something that we would reasonably expect provenance to express. This
> would be transitive.

Yes, this is the "e1 is in the provenance past of e2" statement. The path to that past might or might not also be stated, but it can be inferred to exist, with an undefined number of activities, and possibly complementaries, in between.

> OTOH, "e2 necessarilyDerivedFrom e1" is telling us that the value of e2 is
> in some sense materially affected by e1, which I think is taking us into the
> territory of what we actually mean by materially affected by - it's a
> variation of the problem that concerned me in the first place. I see this
> is the case that Simon shows is not transitive.

Yes, unlike possiblyDerivedFrom, such an assertion provides new information which could not be inferred from the provenance path between e2 and e1. necessarilyDerivedFrom is a strong statement that e2 *was* affected by e1. This also implies that e2 must have been possiblyDerivedFrom e1, because if e2 was affected by e1, then e1 must appear in its provenance past (stated or not).

What 'affected' means is not up to us to define; that is up to the asserter.

One asserter might think that "DRAFT FOR REVIEW" did actually affect the final product (he has identified a pixel that has survived from that draft), and he can state this with "necessarilyDerivedFrom". Note that this does not imply that there was a single activity that used e1 and generated e2 (I think if you know this, then simply state that activity!). It implies only that possiblyDerivedFrom(e2,e1) holds (there was a chain of use/generation/control/dependedOn from e2 leading back to e1), plus the semantic meaning that "e2 was influenced by e1". The nature of that influence can be specified by subproperties/qualifiers.

Another asserter doesn't know, or is not able to tell, whether "DRAFT FOR REVIEW" affected the final product, but he knows it was there somewhere in the past, and can state "possiblyDerivedFrom". He does this because:

a) he does not know all the activities in between, -or-
b) he works on the level of entities rather than activities (data lineage perspective), -or-
c) he wants to be 'complete' and has inferred this from stated activity interactions, -or-
d) something I didn't think of - perhaps he made a subproperty that is stronger than possiblyDerivedFrom but not as strong as necessarilyDerivedFrom

> I don't want to get too hung up on this, but if the logic above is accepted,
> I think it becomes natural to use "derivedFrom" to cover the general
> (weakest) case, since that would be the generalization of all other forms of
> derivation. For example, it is quite intuitive that "directlyDerivedFrom"
> (currently just "derivedFrom") is a specialization of "derivedFrom", and it
> suggests a naming pattern that might be useful for other specializations.

But "Derived from" implies that it *is* affected (by derivation). If I saw "derivedFrom" and "directlyDerivedFrom" I would interpret both as meaning affected, where directlyDerivedFrom is the non-transitive one. I would still have no way to express that something was "in the provenance past of" another entity.

-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester
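The relationship between the two discussion terms can be illustrated with a small sketch. This is only an illustration under stated assumptions, not a proposed vocabulary or implementation: the function name `possibly_derived_closure`, the `(later, earlier)` pair representation, and the entity names e1, e2, e3 are all hypothetical. It assumes that every necessarilyDerivedFrom assertion implies possiblyDerivedFrom, that possiblyDerivedFrom is transitive, and that necessarilyDerivedFrom is never extended by inference.

```python
from itertools import product


def possibly_derived_closure(possibly, necessarily):
    """Return every (later, earlier) pair where `earlier` is in the
    provenance past of `later`.

    Assumption: each necessarilyDerivedFrom pair also counts as
    possiblyDerivedFrom, and possiblyDerivedFrom is transitive.
    """
    closure = set(possibly) | set(necessarily)
    changed = True
    while changed:
        changed = False
        # Iterate over a snapshot so the set can grow inside the loop.
        for (a, b), (c, d) in product(list(closure), repeat=2):
            if b == c and (a, d) not in closure:
                closure.add((a, d))
                changed = True
    return closure


# Hypothetical assertions: e3 was affected by e2, and e2 was affected by e1.
necessarily = {("e3", "e2"), ("e2", "e1")}
possibly = set()

past = possibly_derived_closure(possibly, necessarily)

# e1 is inferred to be in the provenance past of e3 ...
print(("e3", "e1") in past)
# ... but nothing lets us infer that e3 was necessarily affected by e1.
print(("e3", "e1") in necessarily)
```

In this sketch the strong statement stays exactly as asserted, while the weak "in the provenance past of" statement is the one that grows by inference.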
Received on Thursday, 24 November 2011 09:50:29 UTC