- From: Graham Klyne <GK@ninebynine.org>
- Date: Wed, 08 Jun 2011 08:19:21 +0100
- To: Simon Miles <simon.miles@kcl.ac.uk>
- CC: Provenance Working Group WG <public-prov-wg@w3.org>
Hi Simon,

Simon Miles wrote:
> Hello Graham,
>
> As Paul says, perspective is not explicitly mentioned in OPM, but it
> might be implied. The definition in the OPM spec is: "An account
> represents a description at some level of detail as provided by one or
> more observers" and, earlier in the document, accounts are said to be
> "offering different levels of explanation for [a process] execution"
> and that "overlapping accounts are intended to allow various
> descriptions of a same execution". I would intuitively interpret the
> distinction between accounts to be about perspective, particularly by
> being from different "observers" or at different "levels".

Yes, this is what I was thinking. It does not explicitly allow for accounts
of different "perspectives" (*) - the implication being different levels of
granularity on the same perspective (i.e. the same underlying information,
or the same "invariant"). All the examples I recall were likewise directed.

(*) by my understanding of "perspective", which is a consideration of what
information is considered relevant for some particular purpose.

> With regards to comparing accounts of the same process, I would assume
> they are from different perspectives, else why have multiple accounts?
> I don't think there's any reason to require perspectives to be
> incomparable. One OPM graph can express both a coarse-grained account
> and fine-grained account of the same process, which means you could
> express a query (graph traversal) using details from both accounts.

Suppose we're talking about the provenance of a document - when it was
authored, by whom, etc. If the author edits the document, from an
authorship perspective the provenance does not change, but from a temporal
perspective it does. This kind of difference is not covered by simply
different levels of detail. This is why I say that I think that OPM
implicitly assumes a common perspective.
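
The document example above can be sketched in code. This is only an
illustrative sketch, not anything defined by OPM: the names DocumentState,
authorship, temporal and unchanged_under are hypothetical, and a
"perspective" is modelled simply as a function that selects the properties
considered relevant for some purpose.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, Tuple


@dataclass(frozen=True)
class DocumentState:
    """A snapshot of a document (hypothetical model, not part of OPM)."""
    author: str
    last_modified: datetime
    text: str


# Assumption: a perspective maps a state to the tuple of properties
# that are treated as relevant (invariant) under that perspective.
Perspective = Callable[[DocumentState], Tuple[object, ...]]


def authorship(state: DocumentState) -> Tuple[object, ...]:
    # Only who wrote the document matters.
    return (state.author,)


def temporal(state: DocumentState) -> Tuple[object, ...]:
    # Who wrote it and when it was last changed both matter.
    return (state.author, state.last_modified)


def unchanged_under(perspective: Perspective,
                    a: DocumentState, b: DocumentState) -> bool:
    """True if the two states look the same from the given perspective."""
    return perspective(a) == perspective(b)


before = DocumentState("Alice", datetime(2011, 6, 1, 9, 0), "draft")
after = DocumentState("Alice", datetime(2011, 6, 8, 8, 19), "draft, edited")

print(unchanged_under(authorship, before, after))  # True
print(unchanged_under(temporal, before, after))    # False
```

The two perspectives here disagree about whether the edit changed anything,
and neither is a coarser or finer version of the other in any sense the
graph itself records, which is the distinction the example is meant to
show.
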
This is mainly to test my understanding of the OPM concept - I'm not sure
that this is especially important for the current work items (though it may
become so later).

#g
--

> On 6 June 2011 11:24, Paul Groth <pgroth@gmail.com> wrote:
>> Hi Graham,
>>
>> From my understanding OPM doesn't say anything about the perspective.
>> An account is a coloring of the graph with some operation on that
>> coloring.
>>
>> It doesn't say who or what an account is from.
>>
>> cheers,
>> Paul
>>
>>
>> On Mon, Jun 6, 2011 at 9:01 AM, Graham Klyne <GK@ninebynine.org> wrote:
>>> I'm wondering if the use of "account" here is exactly the same as the use of
>>> "account" in OPM. I guess Luc would know best.
>>>
>>> Specifically, when we talk of or compare multiple accounts of some process
>>> of information production, do we require them to all be from the same
>>> perspective? I think that may be what OPM assumes. Maybe it doesn't matter,
>>> but if there's scope for confusion I figure we should at least be aware of
>>> it.
>>>
>>> #g
>>> --
>>>
>>> Simon Miles wrote:
>>>> I think "invariant" is good too.
>>>>
>>>> I was unclear, regarding the proposal to focus on "values/things that
>>>> are immutable according to some perspective or viewpoint", whether it
>>>> is the latter "values" for which we determine provenance or state
>>>> derivation relationships, or whether the "values" are properties of
>>>> the entities which have provenance and there are other mutable (variant)
>>>> values?
>>>>
>>>> If only for my own understanding, I tried looking across the different
>>>> threads on this list. Here's my interpretation of what has been
>>>> implied in terms of definitions (but I might well be misinterpreting).
>>>>
>>>> An entity is something identifiable.
>>>> An account is a record of something that has occurred from a
>>>> particular perspective.
>>>> An invariant property of an entity is a property of that entity which
>>>> is invariant according to a particular perspective.
>>>> An abstraction of an entity is another entity with a subset of its
>>>> invariant properties, according to a particular perspective.
>>>> B derives from A if some of B's invariant properties are due to A's
>>>> invariant properties.
>>>>
>>>> An example trying to capture all the above:
>>>>
>>>> Entities:
>>>> - E1: A government data set with UK government identifier GOVID-12345
>>>> - E2: The data set with a data value for row 2012 being £7,500
>>>> - E3: The corrected data set with the value for row 2012 being £9,000
>>>> - E4: An Excel 2010 spreadsheet containing the corrected data set
>>>>
>>>> Accounts:
>>>> - A1: An account from a perspective in which any government data set
>>>> will always retain the same UK government identifier (a new identifier
>>>> means a new data set)
>>>> - A2: An account from a perspective in which any change of value in a
>>>> data set means it is a new version of that data set
>>>> - A3: An account from a perspective in which any changes to a file by
>>>> writing create a new data set, while any changes due to reading do not
>>>>
>>>> Invariant properties:
>>>> - P1: Identifier GOVID-12345 is invariant for E1, E2, E3, E4 with
>>>> respect to account A1
>>>> - P2: All the data values (including £7,500 for 2012) are invariant
>>>> for E2 with respect to account A2
>>>> - P3: All the data values (including £9,000 for 2012) are invariant
>>>> for E3, E4 with respect to account A2
>>>> - P4: All bytes of the spreadsheet are invariant for E4 except those
>>>> changed on reading (e.g. Excel saves the current open worksheet,
>>>> cursor position etc. even without editing) with respect to account A3
>>>> - P5: The data set (E1) having existed is invariant for E1, E2, E3,
>>>> E4 with respect to any account
>>>> - P6: The first version of the data set (E2) having existed is
>>>> invariant for E2 with respect to any account
>>>> - P7: The corrected version of the data set (E3) having existed is
>>>> invariant for E3, E4 with respect to any account
>>>> - P8: The Excel data set (E4) having existed is invariant for E4 with
>>>> respect to any account
>>>>
>>>> Abstractions:
>>>> - E1 abstracts E2, E3, E4
>>>> - E3 abstracts E4
>>>>
>>>> Derivation:
>>>> - E3 derives from E2 because, aside from the corrected value, all
>>>> other values are copied directly from it (P3 is partly due to P2)
>>>> - E3 also derives from the correction made to the data set, changing
>>>> £7,500 to £9,000 (could be called E5, omitted above for brevity)
>>>>
>>>> We could then say that the provenance of an entity is/includes a
>>>> record of how that entity came to have its invariant properties.
>>>>
>>>> Provenance:
>>>> - Provenance of E1 is how it came to be generated (P5) and came to
>>>> have its ID (P1)
>>>> - Provenance of E2 is how it came to be generated (P5, P6), given its
>>>> ID (P1), and populated with the data it has (P2)
>>>> - Provenance of E3 is how it came to be generated (P5, P7), given its
>>>> ID (P1), and populated with the data it has (P3)
>>>> - Provenance of E4 is how it came to be generated (P5, P7, P8), given
>>>> its ID (P1), populated with the data it has (P3), and serialised to
>>>> its given bytes (P4)
>>>>
>>>> It would be good to know if others are interpreting the consensus in
>>>> the same way!
>>>>
>>>> Thanks,
>>>> Simon
>>>>
>>>> On 3 June 2011 21:36, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
>>>>> I think I am also comfortable with using the term "invariant", if it
>>>>> helps gain consensus.
>>>>>
>>>>>
>>>>>
>>>>> Professor Luc Moreau
>>>>> Electronics and Computer Science
>>>>> University of Southampton
>>>>> Southampton SO17 1BJ
>>>>> United Kingdom
>>>>>
>>>>> On 3 Jun 2011, at 15:06, "Graham Klyne" <GK@ninebynine.org> wrote:
>>>>>
>>>>>> Luc,
>>>>>> Jim,
>>>>>> Khalid,
>>>>>>
>>>>>> I'm responding to all of you at once.
>>>>>>
>>>>>> Short answer: what Luc says.
>>>>>>
>>>>>> I find myself preferring the term "invariant" to "immutable" for just
>>>>>> this reason.
>>>>>>
>>>>>> ...
>>>>>>
>>>>>> Longer answer: there's not a specific thing I want to capture through
>>>>>> derivation of mutable resources. I'm just concerned that insisting on
>>>>>> immutability may prevent useful expression.
>>>>>>
>>>>>> I'll illustrate with an example from a completely different field. For
>>>>>> some years, I have been involved peripherally in definition and registration
>>>>>> of URI schemes, and remain IANA's designated reviewer for new URI schemes.
>>>>>> Several years ago, there was much discussion about registering new URI
>>>>>> schemes vs registering new URN namespaces [2] vs using http URIs for
>>>>>> everything. A specific example is the info: URI scheme [3]. I argued at
>>>>>> the time that this could equally be served by a URN namespace. But the
>>>>>> original definition of URN requirements [4] made some apparently strong
>>>>>> assertions about persistence and permanence of URNs which the community
>>>>>> behind info felt were too constraining, so we ended up with an arguably
>>>>>> unnecessary new URI scheme. Some further history at [5].
>>>>>>
>>>>>> Looking back, I now think the original language in [4] was
>>>>>> over-interpreted, and many people didn't fully recognize that permanence of
>>>>>> identity didn't constrain the identified thing itself possibly changing or
>>>>>> going away. There was an expectation of immutability, not even explicitly
>>>>>> stated, but also not dispelled.
>>>>>>
>>>>>> This is the kind of concern I have with insisting on immutability in
>>>>>> subjects of provenance at the outset.
>>>>>>
>>>>>> [1] http://www.ietf.org/rfc/rfc2141.txt
>>>>>> [2] http://www.ietf.org/rfc/rfc2611.txt,
>>>>>> http://tools.ietf.org/html/rfc3406
>>>>>> [3] http://www.ietf.org/rfc/rfc4452.txt
>>>>>> [4] http://tools.ietf.org/html/rfc1737
>>>>>> [5] http://www.w3.org/TR/uri-clarification/
>>>>>>
>>>>>> #g
>>>>>> --
>>>>>>
>>>>>> Luc Moreau wrote:
>>>>>>> Hi Jim, Graham, Klyne,
>>>>>>> Following yesterday's call, and seeing this thread, it seems that
>>>>>>> "Immutable value" is too restrictive because too absolute.
>>>>>>> What about saying we focus on "/values/things that are immutable
>>>>>>> according to some perspective or viewpoint/"?
>>>>>>> It seems to offer the necessary trade-off and flexibility, with
>>>>>>> - a stable property required for provenance
>>>>>>> - change being allowed according to other viewpoints.
>>>>>>> Cheers,
>>>>>>> Luc
>>>>>>> On 06/03/2011 02:03 AM, Myers, Jim wrote:
>>>>>>>> What do you want to capture with derivation of mutable resources?
>>>>>>>> Simply that one mutable resource can be used in a process and produce
>>>>>>>> another different mutable resource? If so, I'd ask why we should consider
>>>>>>>> this case any different than immutable? (Does the fact that most of what we
>>>>>>>> want to call immutable resources are undergoing constant change (bits
>>>>>>>> getting refresh charges, files moving about in memory caches, etc.) cause
>>>>>>>> any issue with the basic OPM-style model? I think all of these cases are
>>>>>>>> handled just fine by OPM-style constructs and I'd argue further that the key
>>>>>>>> concept about artifacts was not complete immutability with respect to any
>>>>>>>> process we can think of but immutability with respect to the processes
>>>>>>>> involved in the provenance (Eggs used in cake baking do not come out as
>>>>>>>> modified eggs (they become a new cake), but an egg in the fridge and the
>>>>>>>> warmer egg waiting to be mixed are considered the same egg only because we
>>>>>>>> don't want to discuss/report on the warming process that occurred. The
>>>>>>>> fact that an egg has mutability in its temperature doesn't make it a bad
>>>>>>>> artifact in OPM or cause trouble in reporting a baking process...)
>>>>>>>> The mutable case that presents a question is should we provide a
>>>>>>>> second mechanism to allow one to describe a process that changes the state
>>>>>>>> of a mutable resource? - to say that egg with temperature cold is the same egg
>>>>>>>> with temperature warm after a heating process. I suspect that we can't avoid
>>>>>>>> this use case completely but we might not have to create a separate
>>>>>>>> mechanism: If we allow a resource egg to be associated with cold-egg and
>>>>>>>> warm-egg resources, we can use the OPM like mechanism (cold-egg <-- heating
>>>>>>>> <-- warm-egg) while adding cold-egg and warm-egg are 'aspects of' the same
>>>>>>>> mutable egg which 'participates' in a heating process. I think this is
>>>>>>>> general and minimally disruptive. One could say that an egg participated in
>>>>>>>> heating without creating other resources, but one could not directly
>>>>>>>> describe the temperature of the egg before and after heating without
>>>>>>>> creating the cold and warm egg artifacts. I think this also covers what we
>>>>>>>> want from agents and sources - we want to convey that they participate in
>>>>>>>> a process and, while their state changes as they do so, we don't want to
>>>>>>>> document their state changes. But as Simon says we may still want to treat
>>>>>>>> them (e.g. the Royal Society) as resources and talk about their creation so
>>>>>>>> it would be valuable if they could just be artifacts in the context of
>>>>>>>> creation/founding type events. Today, we have agents and sources as
>>>>>>>> different types than artifact so there is no way to talk about their
>>>>>>>> founding, etc.
>>>>>>>> -- Jim
>>>>>>>>
>>>>>>>> ________________________________
>>>>>>>>
>>>>>>>> From: public-prov-wg-request@w3.org on behalf of Graham Klyne
>>>>>>>> Sent: Thu 6/2/2011 3:45 PM
>>>>>>>> To: Khalid Belhajjame
>>>>>>>> Cc: Luc Moreau; public-prov-wg@w3.org
>>>>>>>> Subject: Re: PROV-ISSUE-7 (define-derivation): Definition for Concept
>>>>>>>> 'Derivation' [Provenance Terminology]
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> Khalid Belhajjame wrote:
>>>>>>>>
>>>>>>>>> Hi Graham,
>>>>>>>>>
>>>>>>>>>> I agree that many of the examples of derivation we have raised relate
>>>>>>>>>> to resource states. But if, as has been suggested by myself and others,
>>>>>>>>>> resource states are themselves resources (especially when named for the
>>>>>>>>>> purposes of expressing a derivation), then such derivations can equally
>>>>>>>>>> be regarded as relating resources. I think that's more a difference of
>>>>>>>>>> terminology than fundamental.
>>>>>>>>>
>>>>>>>>> Would it be fair then to say that in that view resources are immutable
>>>>>>>>> resources?
>>>>>>>>>
>>>>>>>> In the case of resources representing a snapshot of state, yes.
>>>>>>>>
>>>>>>>>
>>>>>>>>> Which brings me to the question, do we want to express derivations
>>>>>>>>> between mutable resources, or is that just something that we should
>>>>>>>>> avoid at this point?
>>>>>>>>>
>>>>>>>> (I'm finishing this email after today's telecon, so it's a bit of a
>>>>>>>> re-run.)
>>>>>>>>
>>>>>>>> I think that many of our use-cases are based on invariant values, and the
>>>>>>>> near-term goal is to find expression for these. So we definitely do want to
>>>>>>>> express derivations between non-varying values. But in so doing, it's not
>>>>>>>> clear to me (yet) that we need to exclude mutable resources, so I say let's
>>>>>>>> keep our options open and not close off any possibilities that we don't
>>>>>>>> have to.
>>>>>>>>
>>>>>>>> So my answer to avoiding mutable resources is: "yes and no".
>>>>>>>>
>>>>>>>> #g
>>>>>>>> --
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>>> Thanks, khalid
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>> Where I think I may diverge from what you say is that I would not
>>>>>>>>>> limit such expressions of derivation to resources that happen to be a
>>>>>>>>>> state (or snapshot of state) of some resource. I think defining that
>>>>>>>>>> distinction in a hard-and-fast way, that also aligns with various
>>>>>>>>>> intuitions we may have about derivation, may prove difficult to
>>>>>>>>>> achieve (e.g. as I think is suggested by Jim Myers in
>>>>>>>>>> http://lists.w3.org/Archives/Public/public-prov-wg/2011Jun/0015.html
>>>>>>>>>> (*)).
>>>>>>>>>>
>>>>>>>>>> #g
>>>>>>>>>> --
>>>>>>>>>>
>>>>>>>>>> (*) I just love the W3C mailing list archives - so easy to find links
>>>>>>>>>> to messages, and thus capture provenance!
>>>>>>>>>>
>>>>>>>>>> Khalid Belhajjame wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi,
>>>>>>>>>>>
>>>>>>>>>>> From the discussion so far on derivation it seems that most people
>>>>>>>>>>> tend to define derivation between resource states or resources state
>>>>>>>>>>> representations, but not for resources.
>>>>>>>>>>>
>>>>>>>>>>> My take on this is that in a context where a resource is mutable,
>>>>>>>>>>> derivations will mainly be used to associate resource states and
>>>>>>>>>>> resource states representations.
>>>>>>>>>>>
>>>>>>>>>>> That said, based on derivations connecting resource states and
>>>>>>>>>>> resources state representations, one can infer new derivations
>>>>>>>>>>> between resources. For example, consider the resource r_1 and the
>>>>>>>>>>> associated resource state r_1_s, and consider that r_1_s was used to
>>>>>>>>>>> construct a new resource state r_2_s, actually the first state, of
>>>>>>>>>>> the resource r_2. We can state that r_2_s is derived from r_1_s, i.e.,
>>>>>>>>>>> r_1_s -> r_2_s. We can also state that the resource r_2 is derived
>>>>>>>>>>> from the resource r_1, i.e., r_1 -> r_2
>>>>>>>>>>>
>>>>>>>>>>> PS: I added a definition of derivation along these lines to the
>>>>>>>>>>> wiki:
>>>>>>>>>>> http://www.w3.org/2011/prov/wiki/ConceptDerivation
>>>>>>>>>>>
>>>>>>>>>>> Thanks, khalid
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 01/06/2011 07:49, Luc Moreau wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Hi Graham,
>>>>>>>>>>>>
>>>>>>>>>>>> Isn't it that you used the duri scheme to name the two resource
>>>>>>>>>>>> states that exist in
>>>>>>>>>>>> this scenario?
>>>>>>>>>>>>
>>>>>>>>>>>> In your view of the web, is there a notion of stateful resource?
>>>>>>>>>>>> Does it apply here?
>>>>>>>>>>>>
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Luc
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>> On 31/05/11 23:57, Graham Klyne wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Luc Moreau wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Graham,
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In my example, I really mean for the two versions of the chart to
>>>>>>>>>>>>>> be available at
>>>>>>>>>>>>>> the same URI. (So, definitely, an uncool URI!)
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> In that case, there is a *single* resource, but it is stateful.
>>>>>>>>>>>>>> Hence, there
>>>>>>>>>>>>>> are two *resource states*, one generated using (stats2), and the
>>>>>>>>>>>>>> other using (stats3).
>>>>>>>>>>>>>>
>>>>>>>>>>>>> Luc,
>>>>>>>>>>>>>
>>>>>>>>>>>>> I had interpreted your scenario as using a common URI as you
>>>>>>>>>>>>> explain.
>>>>>>>>>>>>>
>>>>>>>>>>>>> But there are still several resources here, but they are not all
>>>>>>>>>>>>> exposed on the web or assigned URIs. I'm appealing here to
>>>>>>>>>>>>> anything that *might* be identified as opposed to things that
>>>>>>>>>>>>> actually are assigned URIs. (For example, the proposed duri:
>>>>>>>>>>>>> scheme might be used -
>>>>>>>>>>>>> http://tools.ietf.org/id/draft-masinter-dated-uri-07.html)
>>>>>>>>>>>>>
>>>>>>>>>>>>> (And the URI is perfectly "cool" if it is specifically intended to
>>>>>>>>>>>>> denote a dynamic resource. A URI used to access the current
>>>>>>>>>>>>> weather in London can be stable if properly managed.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> (I think this is all entirely consistent with my earlier stated
>>>>>>>>>>>>> positions.)
>>>>>>>>>>>>>
>>>>>>>>>>>>> #g
>>>>>>>>>>>>> --
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>>> Of course, if blogger had used cool uris, then, c2s2 and c2s3
>>>>>>>>>>>>>> would be different resources.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Luc
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> On 05/31/2011 02:25 PM, Graham Klyne wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I see (at least) two resources associated with (c2): one
>>>>>>>>>>>>>>> generated using (stats2), and the other using (stats3). We might
>>>>>>>>>>>>>>> call these (c2s2) and (c2s3).
>>>>>>>>>>>>>>>
>>>>>>>>
>>>>>>> --
>>>>>>> Professor Luc Moreau
>>>>>>> Electronics and Computer Science   tel:   +44 23 8059 4487
>>>>>>> University of Southampton          fax:   +44 23 8059 2865
>>>>>>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>>>>>>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
>>>>> ______________________________________________________________________
>>>>> This email has been scanned by the MessageLabs Email Security System.
>>>>> For more information please visit http://www.messagelabs.com/email
>>>>> ______________________________________________________________________
Received on Wednesday, 8 June 2011 11:12:54 UTC