- From: Graham Klyne <graham.klyne@zoo.ox.ac.uk>
- Date: Fri, 25 Nov 2011 12:36:40 +0000
- To: public-prov-wg@w3.org
Simon,

This is ultimately about modelling choices, and I think there are reasonable, differing positions one might take. Having said my piece, which I think you have understood, I don't feel deeply enough about this to continue arguing the case.

But there's one thing you say that I'd like to respond to, just in case it underlies a misunderstanding:

> However, I still don't see why we would stop people asserting an
> actual, affecting relation between entities if they see that to exist
> in the world, just as they assert the existence of entities,
> activities, generation events etc. based on what exists. In my mind,
> derivation seems core to what is meant by provenance, e.g.
> understanding why a bottle of wine is as it is, I would like to know
> from what grapes it was made, in what land it was grown, what else was
> added, etc.

My position is not about trying to "stop people asserting [a] ... relation". I'd like users to be able to express whatever forms of derivation they find useful to express.

The value of the weak derivation relation I propose is that it acts as a superproperty, a kind of grouping for any or all of these, so that a provenance processor can potentially know that an unknown relation between entities is a kind of derivation, even if it does not know anything more about the nature of such derivation.

#g
--

On 25/11/2011 11:01, Simon Miles wrote:
> Hi Graham,
>
>> directlyDerivedFrom is assertable based on the observation of a process that
>> consumes one and generates the other. (We don't know if the one materially
>> affects the value of the other, but that's not what I thought direct derivation
>> would be saying.)
>
> I'm pretty sure the intent was to say more than this. Why would we
> want to assert this? As I believe Luc has argued in the past, if an
> activity generates B and later uses A, it is not informative to say
> that B derives from A, but this is just an extreme example where lack
> of affect is most evident.
>
>> I think it's the nature of the beast that there is little that can be inferred
>> from a very generic framework.
>
> I completely agree, but I'm not arguing for inference, just assertion
> of derivation.
>
>> So while I agree with your comment about the formal nature of a weak derivation
>> relationship, I see its value is in the intent that is signals (informally), and
>> also that it's a base from which more meaningful derivations can be, er, derived
>> - though specialization. And each meaningful derivation property (e.g.
>> quotation) may have different formal properties.
>
> I understand, but I'm not sure why we would have any general
> derivation relation at all in that case. It seems to say so little
> (unlike, say, wasGeneratedBy) that saying containsQuoteFrom
> specialises wasDerivedFrom seems to say nothing at all. The weak
> definition means the two entities related by wasDerivedFrom may have
> no more connection than two arbitrary entities, surely?
>
>> <aside>
>> A similar argument could be made for first order logic.
>>
>> In isolation, it expresses nothing - just mathematical structures, albeit more
>> complex than transitive closures, but fundamentally no more meaningful.
>>
>> Its value comes from being a framework in which terms can be associated concepts
>> (or things, or values) by extra-logical means, and to hence show how such things
>> must be related if they conform to certain logically expressed patterns.
>> </aside>
>
> Well, yes, but those patterns place constraints, which wasDerivedFrom
> does not seem to. Moreover, PROV-DM in all other cases goes beyond
> expressing nothing, e.g. an activity and an entity have careful
> definitions in terms of what is in the world, as do wasGeneratedBy and
> used.
>
>> I'm not seeing how dependedOn is really adding any expressive power.
>
> I sympathise with that.
>
> However, I still don't see why we would stop people asserting an
> actual, affecting relation between entities if they see that to exist
> in the world, just as they assert the existence of entities,
> activities, generation events etc. based on what exists. In my mind,
> derivation seems core to what is meant by provenance, e.g.
> understanding why a bottle of wine is as it is, I would like to know
> from what grapes it was made, in what land it was grown, what else was
> added, etc.
>
> Thanks,
> Simon
>
>>> On 19 November 2011 09:58, Graham Klyne<graham.klyne@zoo.ox.ac.uk> wrote:
>>>> On 17/11/2011 11:55, Luc Moreau wrote:
>>>>> Hi Graham,
>>>>>
>>>>> The derivation section is indeed complex and needs simplification.
>>>>>
>>>>> I recently made this proposal
>>>>> http://lists.w3.org/Archives/Public/public-prov-wg/2011Nov/0263.html
>>>>>
>>>>> It differs from yours as follows.
>>>>>
>>>>> Two derivations:
>>>>> - wasDerivedFrom: activity linked
>>>>> - wasEventuallyDerivedFrom (replaced by an adequate name)
>>>>>
>>>>> Simon has made the case that wasEventuallyDerivedFrom is not transitive. I think
>>>>> it's reasonable.
>>>>
>>>> I assume you refer to
>>>> http://lists.w3.org/Archives/Public/public-prov-wg/2011Nov/0196.html?
>>>>
>>>> I suppose it depends on what is the intended intent of "wasEventuallyDerivedFrom".
>>>>
>>>> I think a distinction here is between "necessarilyDerivedFrom" and
>>>> "possiblyDerivedFrom" (modal logic, anyone? :) ). (I'm introducing these terms
>>>> for discussion only, I'm not proposing them for use.)
>>>>
>>>> For me, "e2 possiblyDerivedFrom e1" would be statement that e2 has e1 somewhere
>>>> in its derivation history, is easy to understand, and I think this is something
>>>> that we would reasonably expect provenance to express. This would be transitive.
>>>>
>>>> OTOH, "e2 necessarilyDerivedFrom e1" is telling us that the value of e2 is in
>>>> some sense materially affected by e1, which I think is taking us into the
>>>> territory of what we actually mean by materially affected by - it's a variation
>>>> of the problem that concerned me in the first place. I see this is the case
>>>> that Simon shows is not transitive.
>>>>
>>>>> So, what's the difference? wasDerivedFrom is associated with one and only one
>>>>> activity.
>>>>> wasEventuallyDerivedFrom is unspecific about activities behind this derivation
>>>>> (but I believe there is some activity, we just don't know them, nor their number).
>>>>> So, wasDerivedFrom would be a special case of wasEventuallyDerivedFrom.
>>>>
>>>> I'm fine with this, as far as it goes.
>>>>
>>>>> Several of us have indicated it is useful to have a transitive version. Stian
>>>>> has a good idea that the transitive version could also include control and
>>>>> wasComplementOf (a bit like participation was defined, but transitive).
>>>>>
>>>>> This is a much weaker relation, which states that one entity was in the
>>>>> provenance of another, essentially. It's not a derivation.
>>>>> I would define this in the "Common Relation" section. Not sure how we name this,
>>>>> though.
>>>>
>>>> I agree with the above - this corresponds to my "possiblyDerivedFrom". See
>>>> below for naming.
>>>>
>>>> Your above proposal seems to have "eventuallyDerivedFrom" meaning something more
>>>> like my "necessarilyDerivedFrom".
>>>>
>>>> I would choose the weaker form and the direct form as the "built-in" relations
>>>> as they are easier to define in a generic fashion; e.g. directlyDerivedFrom
>>>> (strong form) and possiblyDerivedFrom (weak form). Again I'm choosing these
>>>> names to emphasize my discussion point, not proposing them here.
>>>>
>>>> The weaker form can be specialized by applications where there is a need for a
>>>> stronger notion of derivation; I don't think we're currently in a position to
>>>> say what such a stronger form might be right now. I don't think there is a
>>>> single such form that is always applicable.
>>>>
>>>> ...
>>>>
>>>> Which brings us to naming.
>>>>
>>>> I don't want to get too hung up on this, but if the logic above is accepted, I
>>>> think it becomes natural to use "derivedFrom" to cover the general (weakest)
>>>> case, since that would be the generalization of all other forms of derivation.
>>>> For example, it is quite intuitive that "directlyDerivedFrom" (currently just
>>>> "derivedFrom") is a specialization of "derivedFrom", and it suggests a naming
>>>> pattern that might be useful for other specializations.
>>>>
>>>> #g
>>>> --
>>>>
>>>>
>>>>>
>>>>> On 11/17/2011 11:31 AM, Graham Klyne wrote:
>>>>>> I'm reposting and slightly expanding a couple of PROV-DM issues that came up
>>>>>> in my review of the primer under a separate subject line. They are related to
>>>>>> derivation:
>>>>>>
>>>>>> http://dvcs.w3.org/hg/prov/raw-file/tip/model/ProvenanceModel.html#Derivation-Relation
>>>>>>
>>>>>>
>>>>>> My understanding of what PROV-DM defines:
>>>>>> (a) wasDerivedFrom - activity-linked direct derivation
>>>>>> (b) eventuallyDerivedFrom - activity-independent derivation relation with
>>>>>> explicit impact on result
>>>>>> (c) dependedOn - activity-independent derivation relation possibly without
>>>>>> impact on result
>>>>>>
>>>>>>
>>>>>> == Two or three kinds of derivation? ==
>>>>>>
>>>>>> "PROV-DM offers two different forms of derivation records."
>>>>>>
>>>>>> "The three kinds of derivation records are successively introduced."
>>>>>>
>>>>>>
>>>>>> == eventuallyDerivedFrom vs dependedOn ==
>>>>>>
>>>>>> I have never been particularly comfortable with this attempt to capture the
>>>>>> distinction between something that was merely involved and something that
>>>>>> actively informed the resulting entity. Philosophically, I think it's a very
>>>>>> tricky distinction to draw. Also, it draws us into discussion of what might
>>>>>> have been, which is something I understand that provenance is not intended to
>>>>>> capture.
>>>>>>
>>>>>> In the primer example given about "DRAFT FOR REVIEW", maybe its presence does
>>>>>> have an effect on the eventual document; if it were not present, the document
>>>>>> might have been published without further revision. Who knows? I think there
>>>>>> may be cases where the form of contribution is clearer and testable (e.g.
>>>>>> becamePartOf), but to simply distinguish between contributory and
>>>>>> non-contributory derivation is, I think, rather hard to do.
>>>>>>
>>>>>> My suggestion would be to drop the distinction, but to allow applications to
>>>>>> specialize the property in ways that make sense for the application.
>>>>>>
>>>>>>
>>>>>> == Direct derivation with unspecified action ==
>>>>>>
>>>>>> Is it possible to state that there is a direct derivation relation between two
>>>>>> entities by some unspecified (existentially quantified) process execution?
>>>>>>
>>>>>> I think this is possible using expressions like "wasDerivedFrom(e2,e1)". It is
>>>>>> stated, but I found it took some digging out of the text.
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> My preference would be to have just two derivation properties:
>>>>>>
>>>>>> (1) wasDerivedFrom - transitive, activity-independent, account-independent.
>>>>>> This would effectively be a superproperty of all derivation relations.
>>>>>> (2) wasDirectlyDerivedFrom - non-transitive, activity-dependent (though the
>>>>>> activity may be existentially inferred if not specified), and account-dependent.
>>>>>>
>>>>>> Other application-specific subproperties of wasDerivedFrom could be introduced
>>>>>> as needed to capture more directly traceable notions of (esp. multi-step)
>>>>>> derivation.
>>>>>>
>>>>>> (I think this is closer to the original OPM model, which made more sense to me).
>>>>>>
>>>>>> #g
>>>>>> --
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>>
>>>
>>
>
>
>
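
The superproperty idea discussed in this thread can be illustrated with a minimal sketch. It is only one possible reading, not anything defined by PROV-DM: the relation names wasDerivedFrom, wasDirectlyDerivedFrom and containsQuoteFrom come from the messages above, while the table SUBPROPERTY_OF, the functions is_derivation and derivation_history, the relation reviewedBy, and the entity names are all hypothetical. The sketch assumes the weak, transitive reading of wasDerivedFrom, so a processor that does not understand containsQuoteFrom can still treat it as some kind of derivation.

```python
# Hypothetical subproperty table: each relation maps to its direct superproperty.
# Only the relation names themselves are taken from the thread.
SUBPROPERTY_OF = {
    "wasDirectlyDerivedFrom": "wasDerivedFrom",
    "containsQuoteFrom": "wasDerivedFrom",
}


def is_derivation(relation):
    """Return True if `relation` is wasDerivedFrom or specializes it."""
    seen = set()
    while relation is not None and relation not in seen:
        if relation == "wasDerivedFrom":
            return True
        seen.add(relation)
        relation = SUBPROPERTY_OF.get(relation)
    return False


def derivation_history(entity, assertions):
    """Collect every entity reachable from `entity` by following
    derivation-like relations (the weak, transitive reading)."""
    found = set()
    stack = [entity]
    while stack:
        current = stack.pop()
        for subject, relation, obj in assertions:
            if subject == current and is_derivation(relation) and obj not in found:
                found.add(obj)
                stack.append(obj)
    return found


# Hypothetical assertions as (subject, relation, object) triples.
ASSERTIONS = [
    ("report", "containsQuoteFrom", "article"),
    ("article", "wasDirectlyDerivedFrom", "draft"),
    ("report", "reviewedBy", "alice"),  # not a derivation; ignored below
]

history = derivation_history("report", ASSERTIONS)
```

Here the processor needs no knowledge of what quotation means; it only needs the subproperty link to place "article" and "draft" in the derivation history of "report". Whether that weak statement is informative enough is exactly the point under debate above.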
Received on Friday, 25 November 2011 12:37:32 UTC