- From: Graham Klyne <GK@ninebynine.org>
- Date: Thu, 24 Nov 2011 18:05:51 +0000
- To: Simon Miles <simon.miles@kcl.ac.uk>
- CC: Provenance Working Group WG <public-prov-wg@w3.org>
Hi Simon,

On 23/11/2011 16:33, Simon Miles wrote:
> I think your argument seems to push us to a position where we cannot
> assert anything about derivation at all.

Yes and no...

> If you argue that a "necessarilyDerivedFrom" is not possible to
> assert, I can't see why directlyDerivedFrom (your 'strong form') would
> be possible either. Either we can know it is "materially affected by"
> or we can't. If we can't then the strong form is impossible to assert,
> as all we can say is that two entities were somehow involved in a
> single activity.

directlyDerivedFrom is assertable based on the observation of a process that
consumes one and generates the other. (We don't know if the one materially
affects the value of the other, but that's not what I thought direct derivation
would be saying.)

> The weak form is not really about derivation at all, it's just a
> transitive closure on a directed graph that happens to contain only
> links pointing from future to past. So I find it uncomfortable to
> specialise it for expressing derivation. If the relation I wanted to
> express was "containsQuotationFrom", which seems to be a derivation
> relation to me, then this is not transitive.

I agree, but I'm not seeing the problem here.

> Aren't we left being able to express nothing, just do a transitive closure?

I think it's the nature of the beast that there is little that can be inferred
from a very generic framework. So while I agree with your comment about the
formal nature of a weak derivation relationship, I see its value in the intent
that it signals (informally), and also in being a base from which more
meaningful derivations can be, er, derived - through specialization. And each
meaningful derivation property (e.g. quotation) may have different formal
properties.

Another example: GPL licensed code.
If C includes portions of B and B includes portions of A and A was GPL
licensed, then the GPL licence requirements apply to C, even if C does not
contain any actual code from A (assuming B was released by its developer at any
point). This example is transitive, due to the particular nature of the GPL
licence.

The point I'm trying to make is that I don't think there is much we can do in
the way of expressing interesting conclusions until we start to consider the
specific domain-dependent nature of the derivation being considered.

My understanding is that PROV is intended to be a domain-neutral framework for
assembling provenance traces, the interesting aspects of which may be quite
domain-specific. As such, I don't expect PROV alone to express very much, but
to be used in conjunction with other more semantically rich concepts to capture
complex and important relationships between things.

So, I don't see it as a fault if PROV alone expresses nothing. What counts is
the part it can play in expressing things that are important.

<aside>
A similar argument could be made for first order logic. In isolation, it
expresses nothing - just mathematical structures, albeit more complex than
transitive closures, but fundamentally no more meaningful. Its value comes from
being a framework in which terms can be associated with concepts (or things, or
values) by extra-logical means, and hence to show how such things must be
related if they conform to certain logically expressed patterns.
</aside>

I feel a notion like "dependedOn" has no formal properties, it merely appeals
to an ill-described notion of value propagation (I think). Lacking both formal
properties *and* a clear explanation of what it is *intended* to mean, I don't
see that it's adding any value. What's the interoperability story here: if one
application says A dependedOn B, what can a second application do with this
information?
Why not just introduce a domain-specific relationship for which a specific
meaning can be given informally, and maybe for which formal consequences can be
expressed?

Hence my argument to exclude dependedOn (as I understand it), even if the
framework thereby expresses nothing, because it (a) helps to keep it simpler,
and (b) maintains a separation between inferential machinery and informally
described intended interpretations. I'm not seeing how dependedOn is really
adding any expressive power.

#g
--

> On 19 November 2011 09:58, Graham Klyne <graham.klyne@zoo.ox.ac.uk> wrote:
>> On 17/11/2011 11:55, Luc Moreau wrote:
>>> Hi Graham,
>>>
>>> The derivation section is indeed complex and needs simplification.
>>>
>>> I recently made this proposal
>>> http://lists.w3.org/Archives/Public/public-prov-wg/2011Nov/0263.html
>>>
>>> It differs from yours as follows.
>>>
>>> Two derivations:
>>> - wasDerivedFrom: activity linked
>>> - wasEventuallyDerivedFrom (replaced by an adequate name)
>>>
>>> Simon has made the case that wasEventuallyDerivedFrom is not transitive.
>>> I think it's reasonable.
>>
>> I assume you refer to
>> http://lists.w3.org/Archives/Public/public-prov-wg/2011Nov/0196.html?
>>
>> I suppose it depends on what the intent of "wasEventuallyDerivedFrom" is.
>>
>> I think a distinction here is between "necessarilyDerivedFrom" and
>> "possiblyDerivedFrom" (modal logic, anyone? :) ). (I'm introducing these terms
>> for discussion only, I'm not proposing them for use.)
>>
>> For me, "e2 possiblyDerivedFrom e1" would be a statement that e2 has e1
>> somewhere in its derivation history, which is easy to understand, and I think
>> this is something that we would reasonably expect provenance to express. This
>> would be transitive.
>>
>> OTOH, "e2 necessarilyDerivedFrom e1" is telling us that the value of e2 is in
>> some sense materially affected by e1, which I think is taking us into the
>> territory of what we actually mean by materially affected by - it's a variation
>> of the problem that concerned me in the first place. I see this is the case
>> that Simon shows is not transitive.
>>
>>> So, what's the difference? wasDerivedFrom is associated with one and only one
>>> activity.
>>> wasEventuallyDerivedFrom is unspecific about activities behind this derivation
>>> (but I believe there is some activity, we just don't know them, nor their number).
>>> So, wasDerivedFrom would be a special case of wasEventuallyDerivedFrom.
>>
>> I'm fine with this, as far as it goes.
>>
>>> Several of us have indicated it is useful to have a transitive version. Stian
>>> has a good idea that the transitive version could also include control and
>>> wasComplementOf (a bit like participation was defined, but transitive).
>>>
>>> This is a much weaker relation, which states that one entity was in the
>>> provenance of another, essentially. It's not a derivation.
>>> I would define this in the "Common Relation" section. Not sure how we name
>>> this, though.
>>
>> I agree with the above - this corresponds to my "possiblyDerivedFrom". See
>> below for naming.
>>
>> Your above proposal seems to have "eventuallyDerivedFrom" meaning something more
>> like my "necessarilyDerivedFrom".
>>
>> I would choose the weaker form and the direct form as the "built-in" relations
>> as they are easier to define in a generic fashion; e.g. directlyDerivedFrom
>> (strong form) and possiblyDerivedFrom (weak form). Again I'm choosing these
>> names to emphasize my discussion point, not proposing them here.
>>
>> The weaker form can be specialized by applications where there is a need for a
>> stronger notion of derivation; I don't think we're currently in a position to
>> say what such a stronger form might be right now.
>> I don't think there is a single such form that is always applicable.
>>
>> ...
>>
>> Which brings us to naming.
>>
>> I don't want to get too hung up on this, but if the logic above is accepted, I
>> think it becomes natural to use "derivedFrom" to cover the general (weakest)
>> case, since that would be the generalization of all other forms of derivation.
>> For example, it is quite intuitive that "directlyDerivedFrom" (currently just
>> "derivedFrom") is a specialization of "derivedFrom", and it suggests a naming
>> pattern that might be useful for other specializations.
>>
>> #g
>> --
>>
>>>
>>> On 11/17/2011 11:31 AM, Graham Klyne wrote:
>>>> I'm reposting and slightly expanding a couple of PROV-DM issues that came up
>>>> in my review of the primer under a separate subject line. They are related to
>>>> derivation:
>>>>
>>>> http://dvcs.w3.org/hg/prov/raw-file/tip/model/ProvenanceModel.html#Derivation-Relation
>>>>
>>>>
>>>> My understanding of what PROV-DM defines:
>>>> (a) wasDerivedFrom - activity-linked direct derivation
>>>> (b) eventuallyDerivedFrom - activity-independent derivation relation with
>>>> explicit impact on result
>>>> (c) dependedOn - activity-independent derivation relation possibly without
>>>> impact on result
>>>>
>>>>
>>>> == Two or three kinds of derivation? ==
>>>>
>>>> "PROV-DM offers two different forms of derivation records."
>>>>
>>>> "The three kinds of derivation records are successively introduced."
>>>>
>>>>
>>>> == eventuallyDerivedFrom vs dependedOn ==
>>>>
>>>> I have never been particularly comfortable with this attempt to capture the
>>>> distinction between something that was merely involved and something that
>>>> actively informed the resulting entity. Philosophically, I think it's a very
>>>> tricky distinction to draw. Also, it draws us into discussion of what might
>>>> have been, which is something I understand that provenance is not intended to
>>>> capture.
>>>>
>>>> In the primer example given about "DRAFT FOR REVIEW", maybe its presence does
>>>> have an effect on the eventual document; if it were not present, the document
>>>> might have been published without further revision. Who knows? I think there
>>>> may be cases where the form of contribution is clearer and testable (e.g.
>>>> becamePartOf), but to simply distinguish between contributory and
>>>> non-contributory derivation is, I think, rather hard to do.
>>>>
>>>> My suggestion would be to drop the distinction, but to allow applications to
>>>> specialize the property in ways that make sense for the application.
>>>>
>>>>
>>>> == Direct derivation with unspecified action ==
>>>>
>>>> Is it possible to state that there is a direct derivation relation between two
>>>> entities by some unspecified (existentially quantified) process execution?
>>>>
>>>> I think this is possible using expressions like "wasDerivedFrom(e2,e1)". It is
>>>> stated, but I found it took some digging out of the text.
>>>>
>>>> ...
>>>>
>>>> My preference would be to have just two derivation properties:
>>>>
>>>> (1) wasDerivedFrom - transitive, activity-independent, account-independent.
>>>> This would effectively be a superproperty of all derivation relations.
>>>> (2) wasDirectlyDerivedFrom - non-transitive, activity-dependent (though the
>>>> activity may be existentially inferred if not specified), and account-dependent.
>>>>
>>>> Other application-specific subproperties of wasDerivedFrom could be introduced
>>>> as needed to capture more directly traceable notions of (esp. multi-step)
>>>> derivation.
>>>>
>>>> (I think this is closer to the original OPM model, which made more sense to me).
>>>>
>>>> #g
>>>> --
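
As a rough illustration of the two-property preference quoted above (a transitive wasDerivedFrom acting as superproperty of a non-transitive wasDirectlyDerivedFrom), here is a minimal Python sketch. The entity names A, B and C, the variable names and the helper function are all hypothetical, and nothing here is prescribed by PROV; it only shows that the weak relation can be computed as the transitive closure of the direct one.

```python
from collections import defaultdict


def transitive_closure(direct_edges):
    """Return every (later, earlier) pair linked by one or more direct steps."""
    # Index the direct relation: entity -> entities it was directly derived from.
    sources = defaultdict(set)
    for later, earlier in direct_edges:
        sources[later].add(earlier)

    closure = set()
    for start in list(sources):
        stack = list(sources[start])
        seen = set()
        while stack:
            node = stack.pop()
            if node in seen:
                continue
            seen.add(node)
            closure.add((start, node))
            # Follow further direct steps back into the past.
            stack.extend(sources.get(node, ()))
    return closure


# Hypothetical assertions: C was directly derived from B, and B from A.
was_directly_derived_from = {("C", "B"), ("B", "A")}

# The weak relation contains every direct assertion plus the implied C -> A link.
was_derived_from = transitive_closure(was_directly_derived_from)
```

Under these assumptions the pair ("C", "A") appears only in the weak relation, which is the sense in which the direct form is a specialization of the general one. A domain-specific property such as the GPL example would need its own rule to say whether it propagates along these links.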
Received on Friday, 25 November 2011 10:11:33 UTC