Re: PROV-DM derivation concerns arising from my primer review

From: Graham Klyne <Graham.Klyne@zoo.ox.ac.uk>
Date: Fri, 25 Nov 2011 12:56:08 +0000
Message-ID: <4ECF9068.7070706@zoo.ox.ac.uk>
To: Stian Soiland-Reyes <soiland-reyes@cs.manchester.ac.uk>
CC: Luc Moreau <L.Moreau@ecs.soton.ac.uk>, public-prov-wg@w3.org

I agree with most of what you say.

See also my recent reply to Simon on this topic 
(http://lists.w3.org/Archives/Public/public-prov-wg/2011Nov/0395.html) - I'll 
try not to repeat what I say there.

On 24/11/2011 09:49, Stian Soiland-Reyes wrote:
> On Sat, Nov 19, 2011 at 08:17, Graham Klyne <graham.klyne@zoo.ox.ac.uk> wrote:
>> I think a distinction here is between "necessarilyDerivedFrom" and
>> "possiblyDerivedFrom" (modal logic, anyone? :) ).  (I'm introducing these
>> terms for discussion only, I'm not proposing them for use.)
>> For me, "e2 possiblyDerivedFrom e1" would be statement that e2 has e1
>> somewhere in its derivation history, is easy to understand, and I think this
>> is something that we would reasonably expect provenance to express.  This
>> would be transitive.
> Yes, this is the "e1 is in the provenance past of e2" statement. The
> path to that past might or might not also be stated - but it can be
> inferred to exist, with undefined number of activities and possibly
> complementaries in-between.
>> OTOH, "e2 necessarilyDerivedFrom e1" is telling us that the value of e2 is
>> in some sense materially affected by e1, which I think is taking us into the
>> territory of what we actually mean by materially affected by - it's a
>> variation of the problem that concerned me in the first place.  I see this
>> is the case that Simon shows is not transitive.
> Yes, unlike the possiblyDerivedFrom, such an assertion provides new
> information which could not be inferred by the provenance path between
> e2 and e1.
> necessarilyDerivedFrom is a strong statement that e2 *was* affected by
> e1. This implies also that e2 must have been possiblyDerivedFrom e1 as
> well, because if e2 was affected by e1, then e1 must appear in its
> provenance past (stated or not).
> The nature of what 'affected' means is not up to us to define, that is
> up to the asserter.
> One asserter might think that "DRAFT FOR REVIEW" did actually affect
> the final product (he has identified a pixel that has survived from
> that draft) - he can state this with "necessarilyDerivedFrom".

You make a good case here for something stronger than the weakest form of 
derivation, by being clear that the precise meaning is in the belief of the 
asserter.  It's a different modality, maybe "believedToBeDerivedFrom"?
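
As an illustration of the two readings discussed above, here is a minimal
Python sketch. It is not part of PROV-DM. It assumes that
"possiblyDerivedFrom" is the transitive closure of whatever one-step links
have been stated, and that "necessarilyDerivedFrom" is only ever asserted,
never inferred. The entity names e1, e2, e3 and the names asserted_links and
possibly_derived_from are made up for the example.

```python
from collections import defaultdict

# Hypothetical one-step links, written as (derived entity, source entity).
asserted_links = {("e3", "e2"), ("e2", "e1")}

# Hypothetical asserter claims that the source materially affected the result.
# These are taken as given and are never extended by inference.
necessarily = {("e3", "e2"), ("e2", "e1")}


def possibly_derived_from(links):
    """Return the transitive closure of the given one-step links."""
    sources = defaultdict(set)
    for derived, source in links:
        sources[derived].add(source)

    closure = set()
    for start in list(sources):
        stack = list(sources[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue  # already visited; also guards against cycles
            seen.add(node)
            closure.add((start, node))
            stack.extend(sources.get(node, ()))
    return closure


# Every "necessarily" claim also counts as a link into the provenance past.
possibly = possibly_derived_from(asserted_links | necessarily)

print(("e3", "e1") in possibly)     # True: the weak relation is transitive
print(("e3", "e1") in necessarily)  # False: the strong claim is not inferred
print(necessarily <= possibly)      # True: strong implies weak
```

Under these assumptions, e1 turns up in the provenance past of e3 without
anyone having said so, while the stronger claim between e3 and e1 would
still need an asserter to make it.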

> Note that this does not imply that there was a single activity that
> used e1 and generated e2 (I think if you know this, then simply state
> that activity!) - just that possiblyDerivedFrom(e2,e1) (there was a
> chain of use/generation/control/dependedOn from e2 leading back to e1)
> and the semantic meaning that "e2 was influenced by e1". The nature of
> that influence can be specified by subproperties/qualifiers.
> Another asserter doesn't know or is not able to tell if "DRAFT FOR
> REVIEW" affected the final product, but he knows it was there
> somewhere in the past, and can state "possiblyDerivedFrom". He does
> this because
> a) He does not know all the activities in between,
> -or-
> b) He works on the level of entities rather than activities (data
> lineage perspective)
> -or-
> c) Wants to be 'complete' and have inferred this from stated activity
> interactions.
> -or-
> d) Something I didn't think of - perhaps he made a subproperty that is
> stronger than possiblyDerivedFrom but not as strong as
> necessarilyDerivedFrom

I'm not sure about (d): if "necessarilyDerivedFrom" is in the belief of an 
asserter, and has no formal semantics, how can we say it's stronger than 
"possiblyDerivedFrom" (other than as an informal claim)?

>> I don't want to get too hung up on this, but if the logic above is accepted,
>> I think it becomes natural to use "derivedFrom" to cover the general
>> (weakest) case, since that would be the generalization of all other forms of
>> derivation. For example, it is quite intuitive that "directlyDerivedFrom"
>> (currently just "derivedFrom") is a specialization of "derivedFrom", and it
>> suggests a naming pattern that might be useful for other specializations.
> But "Derived from" implies that it *is* affected (by derivation). If I
> saw "derivedFrom" and "directlyDerivedFrom" I would interpret these as
> both being affected - where directlyDerivedFrom is the non-transitive
> one. I have still no way to express that something was "in the
> provenance past of" another entity.

Yes, I was suggesting a renaming here.   The text attempts to explain why I 
suggest such a renaming.
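
The naming pattern can be sketched in the same spirit. The Python below is
an illustration only: it assumes that each specialized property names the
more general one it refines, so that a statement using the specialized
property also entails the general one. The SUBPROPERTY_OF table and the
entailed function are hypothetical, and the single entry reflects only the
renaming suggested in this thread.

```python
# Hypothetical table: specialized property -> the property it specializes.
SUBPROPERTY_OF = {
    "directlyDerivedFrom": "derivedFrom",
}


def entailed(statements):
    """Add, for each statement, the same statement under every generalization."""
    result = set(statements)
    for prop, subject, obj in statements:
        parent = SUBPROPERTY_OF.get(prop)
        while parent is not None:
            result.add((parent, subject, obj))
            parent = SUBPROPERTY_OF.get(parent)
    return result


# One stated fact: e2 directlyDerivedFrom e1.
stated = {("directlyDerivedFrom", "e2", "e1")}
facts = entailed(stated)

print(("derivedFrom", "e2", "e1") in facts)          # True
print(("directlyDerivedFrom", "e2", "e1") in facts)  # True
```

Further specializations would be added as extra rows in the table; whether
any given property really is a specialization of "derivedFrom" is a modelling
decision that the sketch does not settle.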

Received on Friday, 25 November 2011 12:59:01 UTC
