- From: Paul Groth <pgroth@gmail.com>
- Date: Mon, 6 Jun 2011 12:21:47 +0200
- To: Graham Klyne <GK@ninebynine.org>
- Cc: Provenance Working Group WG <public-prov-wg@w3.org>
Hi Graham,

From my understanding, OPM doesn't say anything about the perspective. An
account is a coloring of the graph with some operation on that coloring.
It doesn't say who or what an account is from.

cheers,
Paul

On Mon, Jun 6, 2011 at 9:01 AM, Graham Klyne <GK@ninebynine.org> wrote:
> I'm wondering if the use of "account" here is exactly the same as the use of
> "account" in OPM. I guess Luc would know best.
>
> Specifically, when we talk of or compare multiple accounts of some process
> of information production, do we require them to all be from the same
> perspective? I think that may be what OPM assumes. Maybe it doesn't matter,
> but if there's scope for confusion I figure we should at least be aware of
> it.
>
> #g
> --
>
> Simon Miles wrote:
>>
>> I think "invariant" is good too.
>>
>> I was unclear, regarding the proposal to focus on "values/things that
>> are immutable according to some perspective or viewpoint", whether it
>> is the latter "values" for which we determine provenance or state
>> derivation relationships, or whether the "values" are properties of
>> the entities which have provenance and there are other mutable (variant)
>> values?
>>
>> If only for my own understanding, I tried looking across the different
>> threads on this list. Here's my interpretation of what has been
>> implied in terms of definitions (but I might well be misinterpreting).
>>
>> An entity is something identifiable.
>> An account is a record of something that has occurred from a
>> particular perspective.
>> An invariant property of an entity is a property of that entity which
>> is invariant according to a particular perspective.
>> An abstraction of an entity is another entity with a subset of its
>> invariant properties, according to a particular perspective.
>> B derives from A if some of B's invariant properties are due to A's
>> invariant properties.
>>
>> An example trying to capture all the above:
>>
>> Entities:
>> - E1: A government data set with UK government identifier GOVID-12345
>> - E2: The data set with a data value for row 2012 being £7,500
>> - E3: The corrected data set with the value for row 2012 being £9,000
>> - E4: An Excel 2010 spreadsheet containing the corrected data set
>>
>> Accounts:
>> - A1: An account from a perspective in which any government data set
>> will always retain the same UK government identifier (a new identifier
>> means a new data set)
>> - A2: An account from a perspective in which any change of value in a
>> data set means it is a new version of that data set
>> - A3: An account from a perspective in which any changes to a file by
>> writing create a new data set, while any changes due to reading do not
>>
>> Invariant properties:
>> - P1: Identifier GOVID-12345 is invariant for E1, E2, E3, E4 with
>> respect to account A1
>> - P2: All the data values (including £7,500 for 2012) are invariant
>> for E2 with respect to account A2
>> - P3: All the data values (including £9,000 for 2012) are invariant
>> for E3, E4 with respect to account A2
>> - P4: All bytes of the spreadsheet are invariant for E4 except those
>> changed on reading (e.g. Excel saves the current open worksheet,
>> cursor position etc.
>> even without editing) with respect to account A3
>> - P5: The data set (E1) having existed is invariant for E1, E2, E3,
>> E4 with respect to any account
>> - P6: The first version of the data set (E2) having existed is
>> invariant for E2 with respect to any account
>> - P7: The corrected version of the data set (E3) having existed is
>> invariant for E3, E4 with respect to any account
>> - P8: The Excel data set (E4) having existed is invariant for E4 with
>> respect to any account
>>
>> Abstractions:
>> - E1 abstracts E2, E3, E4
>> - E3 abstracts E4
>>
>> Derivation:
>> - E3 derives from E2 because, aside from the corrected value, all
>> other values are copied directly from it (P3 is partly due to P2)
>> - E3 also derives from the correction made to the data set, changing
>> £7,500 to £9,000 (could be called E5, omitted above for brevity)
>>
>> We could then say that the provenance of an entity is/includes a
>> record of how that entity came to have its invariant properties.
>>
>> Provenance:
>> - Provenance of E1 is how it came to be generated (P5) and came to
>> have its ID (P1)
>> - Provenance of E2 is how it came to be generated (P5, P6), given its
>> ID (P1), and populated with the data it has (P2)
>> - Provenance of E3 is how it came to be generated (P5, P7), given its
>> ID (P1), and populated with the data it has (P3)
>> - Provenance of E4 is how it came to be generated (P5, P7, P8), given
>> its ID (P1), populated with the data it has (P3), and serialised to
>> its given bytes (P4)
>>
>> It would be good to know if others are interpreting the consensus in
>> the same way!
>>
>> Thanks,
>> Simon
>>
>> On 3 June 2011 21:36, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
>>>
>>> I think I am also comfortable with using the term "invariant", if it
>>> helps gain consensus.
>>>
>>> Professor Luc Moreau
>>> Electronics and Computer Science
>>> University of Southampton
>>> Southampton SO17 1BJ
>>> United Kingdom
>>>
>>> On 3 Jun 2011, at 15:06, "Graham Klyne" <GK@ninebynine.org> wrote:
>>>
>>>> Luc,
>>>> Jim,
>>>> Khalid,
>>>>
>>>> I'm responding to all of you at once.
>>>>
>>>> Short answer: what Luc says.
>>>>
>>>> I find myself preferring the term "invariant" to "immutable" for just
>>>> this reason.
>>>>
>>>> ...
>>>>
>>>> Longer answer: there's not a specific thing I want to capture through
>>>> derivation of mutable resources. I'm just concerned that insisting on
>>>> immutability may prevent useful expression.
>>>>
>>>> I'll illustrate with an example from a completely different field. For
>>>> some years, I have been involved peripherally in definition and registration
>>>> of URI schemes, and remain IANA's designated reviewer for new URI schemes.
>>>> Several years ago, there was much discussion about registering new URI
>>>> schemes vs registering new URN namespaces [2] vs using http URIs for
>>>> everything. A specific example is the info: URI scheme [3]. I argued at
>>>> the time that this could equally be served by a URN namespace. But the
>>>> original definition of URN requirements [4] made some apparently strong
>>>> assertions about persistence and permanence of URNs which the community
>>>> behind info felt were too constraining, so we ended up with an arguably
>>>> unnecessary new URI scheme. Some further history at [5].
>>>>
>>>> Looking back, I now think the original language in [4] was
>>>> over-interpreted, and many people didn't fully recognize that permanence of
>>>> identity didn't constrain the identified thing itself possibly changing or
>>>> going away. There was an expectation of immutability, not even explicitly
>>>> stated, but also not dispelled.
>>>>
>>>> This is the kind of concern I have with insisting on immutability in
>>>> subjects of provenance at the outset.
>>>>
>>>> [1] http://www.ietf.org/rfc/rfc2141.txt
>>>> [2] http://www.ietf.org/rfc/rfc2611.txt,
>>>> http://tools.ietf.org/html/rfc3406
>>>> [3] http://www.ietf.org/rfc/rfc4452.txt
>>>> [4] http://tools.ietf.org/html/rfc1737
>>>> [5] http://www.w3.org/TR/uri-clarification/
>>>>
>>>> #g
>>>> --
>>>>
>>>> Luc Moreau wrote:
>>>>>
>>>>> Hi Jim, Graham, Klyne,
>>>>> Following yesterday's call, and seeing this thread, it seems that
>>>>> "Immutable value" is too restrictive because too absolute.
>>>>> What about saying we focus on "/values/things that are immutable
>>>>> according to some perspective or viewpoint/"?
>>>>> It seems to offer the necessary trade-off and flexibility, with
>>>>> - a stable property required for provenance
>>>>> - change being allowed according to other viewpoints.
>>>>> Cheers,
>>>>> Luc
>>>>> On 06/03/2011 02:03 AM, Myers, Jim wrote:
>>>>>>
>>>>>> What do you want to capture with derivation of mutable resources?
>>>>>> Simply that one mutable resource can be used in a process and produce
>>>>>> another different mutable resource? If so, I'd ask why we should consider
>>>>>> this case any different than immutable? (Does the fact that most of what we
>>>>>> want to call immutable resources are undergoing constant change (bits
>>>>>> getting refresh charges, files moving about in memory caches, etc.) cause
>>>>>> any issue with the basic OPM-style model? I think all of these cases are
>>>>>> handled just fine by OPM-style constructs and I'd argue further that the key
>>>>>> concept about artifacts was not complete immutability with respect to any
>>>>>> process we can think of but immutability with respect to the processes
>>>>>> involved in the provenance (Eggs used in cake baking do not come out as
>>>>>> modified eggs (they become a new cake), but an egg in the fridge and the
>>>>>> warmer egg waiting to be mixed are considered the same egg only because we
>>>>>> don't want to discuss/report on the warming process that occurred. The fact
>>>>>> that an egg has mutability in its temperature doesn't make it a bad artifact
>>>>>> in OPM or cause trouble in reporting a baking process...)
>>>>>>
>>>>>> The mutable case that presents a question is: should we provide a
>>>>>> second mechanism to allow one to describe a process that changes the state
>>>>>> of a mutable resource, i.e. to say that the egg with temperature cold is the
>>>>>> same egg with temperature warm after a heating process? I suspect that we
>>>>>> can't avoid this use case completely but we might not have to create a
>>>>>> separate mechanism: If we allow a resource egg to be associated with cold-egg
>>>>>> and warm-egg resources, we can use the OPM-like mechanism (cold-egg <-- heating
>>>>>> <-- warm-egg) while adding that cold-egg and warm-egg are 'aspects of' the
>>>>>> same mutable egg which 'participates' in a heating process. I think this is
>>>>>> general and minimally disruptive. One could say that an egg participated in
>>>>>> heating without creating other resources, but one could not directly
>>>>>> describe the temperature of the egg before and after heating without
>>>>>> creating the cold and warm egg artifacts. I think this also covers what we
>>>>>> want from agents and sources - we want to convey that they participate in a
>>>>>> process and, while their state changes as they do so, we don't want to
>>>>>> document their state changes. But as Simon says we may still want to treat
>>>>>> them (e.g. the Royal Society) as resources and talk about their creation so
>>>>>> it would be valuable if they could just be artifacts in the context of
>>>>>> creation/founding type events. Today, we have agents and sources as
>>>>>> different types than artifact so there is no way to talk about their
>>>>>> founding, etc.
>>>>>>
>>>>>> -- Jim
>>>>>>
>>>>>> ________________________________
>>>>>>
>>>>>> From: public-prov-wg-request@w3.org on behalf of Graham Klyne
>>>>>> Sent: Thu 6/2/2011 3:45 PM
>>>>>> To: Khalid Belhajjame
>>>>>> Cc: Luc Moreau; public-prov-wg@w3.org
>>>>>> Subject: Re: PROV-ISSUE-7 (define-derivation): Definition for Concept
>>>>>> 'Derivation' [Provenance Terminology]
>>>>>>
>>>>>> Khalid Belhajjame wrote:
>>>>>>
>>>>>>> Hi Graham,
>>>>>>>
>>>>>>>> I agree that many of the examples of derivation we have raised relate
>>>>>>>> to resource states. But if, as has been suggested by myself and others,
>>>>>>>> resource states are themselves resources (especially when named for the
>>>>>>>> purposes of expressing a derivation), then such derivations can equally
>>>>>>>> be regarded as relating resources. I think that's more a difference of
>>>>>>>> terminology than fundamental.
>>>>>>>
>>>>>>> Would it be fair then to say that in that view resources are
>>>>>>> immutable resources?
>>>>>>>
>>>>>> In the case of resources representing a snapshot of state, yes.
>>>>>>
>>>>>>> Which brings me to the question: do we want to express derivations
>>>>>>> between mutable resources, or is that just something that we should
>>>>>>> avoid at this point?
>>>>>>>
>>>>>> (I'm finishing this email after today's telecon, so it's a bit of a
>>>>>> re-run.)
>>>>>>
>>>>>> I think that many of our use-cases are based on invariant values, and the
>>>>>> near-term goal is to find expression for these. So we definitely do want to
>>>>>> express derivations between non-varying values. But in so doing, it's not
>>>>>> clear to me (yet) that we need to exclude mutable resources, so I say let's
>>>>>> keep our options open and not close off any possibilities that we don't have
>>>>>> to.
>>>>>>
>>>>>> So my answer to avoiding mutable resources is: "yes and no".
>>>>>>
>>>>>> #g
>>>>>> --
>>>>>>
>>>>>>> Thanks, khalid
>>>>>>>
>>>>>>>> Where I think I may diverge from what you say is that I would not
>>>>>>>> limit such expressions of derivation to resources that happen to be a
>>>>>>>> state (or snapshot of state) of some resource. I think defining that
>>>>>>>> distinction in a hard-and-fast way, that also aligns with various
>>>>>>>> intuitions we may have about derivation, may prove difficult to
>>>>>>>> achieve (e.g. as I think is suggested by Jim Myers in
>>>>>>>> http://lists.w3.org/Archives/Public/public-prov-wg/2011Jun/0015.html
>>>>>>>> (*)).
>>>>>>>>
>>>>>>>> #g
>>>>>>>> --
>>>>>>>>
>>>>>>>> (*) I just love the W3C mailing list archives - so easy to find links
>>>>>>>> to messages, and thus capture provenance!
>>>>>>>>
>>>>>>>> Khalid Belhajjame wrote:
>>>>>>>>
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>> From the discussion so far on derivation it seems that most people
>>>>>>>>> tend to define derivation between resource states or resource state
>>>>>>>>> representations, but not for resources.
>>>>>>>>>
>>>>>>>>> My take on this is that in a context where a resource is mutable,
>>>>>>>>> derivations will mainly be used to associate resource states and
>>>>>>>>> resource state representations.
>>>>>>>>>
>>>>>>>>> That said, based on derivations connecting resource states and
>>>>>>>>> resource state representations, one can infer new derivations
>>>>>>>>> between resources. For example, consider the resource r_1 and the
>>>>>>>>> associated resource state r_1_s, and consider that r_1_s was used to
>>>>>>>>> construct a new resource state r_2_s, actually the first state, of
>>>>>>>>> the resource r_2. We can state that r_2_s is derived from r_1_s, i.e.,
>>>>>>>>> r_1_s -> r_2_s.
>>>>>>>>> We can also state that the resource r_2 is derived
>>>>>>>>> from the resource r_1, i.e., r_1 -> r_2
>>>>>>>>>
>>>>>>>>> PS: I added a definition of derivation along these lines to the
>>>>>>>>> wiki:
>>>>>>>>> http://www.w3.org/2011/prov/wiki/ConceptDerivation
>>>>>>>>>
>>>>>>>>> Thanks, khalid
>>>>>>>>>
>>>>>>>>> On 01/06/2011 07:49, Luc Moreau wrote:
>>>>>>>>>
>>>>>>>>>> Hi Graham,
>>>>>>>>>>
>>>>>>>>>> Isn't it that you used the duri scheme to name the two resource
>>>>>>>>>> states that exist in this scenario?
>>>>>>>>>>
>>>>>>>>>> In your view of the web, is there a notion of stateful resource?
>>>>>>>>>> Does it apply here?
>>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Luc
>>>>>>>>>>
>>>>>>>>>> On 31/05/11 23:57, Graham Klyne wrote:
>>>>>>>>>>
>>>>>>>>>>> Luc Moreau wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Graham,
>>>>>>>>>>>>
>>>>>>>>>>>> In my example, I really mean for the two versions of the chart to
>>>>>>>>>>>> be available at the same URI. (So, definitely, an uncool URI!)
>>>>>>>>>>>>
>>>>>>>>>>>> In that case, there is a *single* resource, but it is stateful.
>>>>>>>>>>>> Hence, there are two *resource states*, one generated using
>>>>>>>>>>>> (stats2), and the other using (stats3).
>>>>>>>>>>>>
>>>>>>>>>>> Luc,
>>>>>>>>>>>
>>>>>>>>>>> I had interpreted your scenario as using a common URI as you
>>>>>>>>>>> explain.
>>>>>>>>>>>
>>>>>>>>>>> But there are still several resources here, even though they are
>>>>>>>>>>> not all exposed on the web or assigned URIs. I'm appealing here to
>>>>>>>>>>> anything that *might* be identified as opposed to things that
>>>>>>>>>>> actually are assigned URIs. (For example, the proposed duri:
>>>>>>>>>>> scheme might be used -
>>>>>>>>>>> http://tools.ietf.org/id/draft-masinter-dated-uri-07.html)
>>>>>>>>>>>
>>>>>>>>>>> (And the URI is perfectly "cool" if it is specifically intended to
>>>>>>>>>>> denote a dynamic resource.
>>>>>>>>>>> A URI used to access the current
>>>>>>>>>>> weather in London can be stable if properly managed.)
>>>>>>>>>>>
>>>>>>>>>>> (I think this is all entirely consistent with my earlier stated
>>>>>>>>>>> positions.)
>>>>>>>>>>>
>>>>>>>>>>> #g
>>>>>>>>>>> --
>>>>>>>>>>>
>>>>>>>>>>>> Of course, if blogger had used cool uris, then, c2s2 and c2s3
>>>>>>>>>>>> would be different resources.
>>>>>>>>>>>>
>>>>>>>>>>>> Luc
>>>>>>>>>>>>
>>>>>>>>>>>> On 05/31/2011 02:25 PM, Graham Klyne wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> I see (at least) two resources associated with (c2): one
>>>>>>>>>>>>> generated using (stats2), and the other using (stats3). We might
>>>>>>>>>>>>> call these (c2s2) and (c2s3).
>>>>>>>>>>>>>
>>>>> --
>>>>> Professor Luc Moreau
>>>>> Electronics and Computer Science   tel:   +44 23 8059 4487
>>>>> University of Southampton          fax:   +44 23 8059 2865
>>>>> Southampton SO17 1BJ               email: l.moreau@ecs.soton.ac.uk
>>>>> United Kingdom                     http://www.ecs.soton.ac.uk/~lavm
Received on Monday, 6 June 2011 10:22:17 UTC
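
Simon's definition of abstraction, quoted in the thread above, can be checked mechanically against his own example. The following is only a rough sketch in Python, not anything proposed on the list. It assumes two simplifications: "subset" is read as a proper subset, and the "according to a particular perspective" qualifier is ignored, so all of P1 to P8 are pooled regardless of account. The names `INVARIANT_FOR`, `invariants_of` and `abstracts` are made up for illustration.

```python
# Which entities each invariant property P1..P8 holds for, transcribed
# from Simon's example. The account (perspective) each property is
# relative to is deliberately dropped here; that is a simplification.
INVARIANT_FOR = {
    "P1": {"E1", "E2", "E3", "E4"},
    "P2": {"E2"},
    "P3": {"E3", "E4"},
    "P4": {"E4"},
    "P5": {"E1", "E2", "E3", "E4"},
    "P6": {"E2"},
    "P7": {"E3", "E4"},
    "P8": {"E4"},
}

ENTITIES = ["E1", "E2", "E3", "E4"]


def invariants_of(entity):
    """Return the set of property labels that are invariant for an entity."""
    return {p for p, holders in INVARIANT_FOR.items() if entity in holders}


def abstracts(a, b):
    """True if a's invariant properties are a proper subset of b's."""
    return invariants_of(a) < invariants_of(b)


# Every ordered pair (a, b) such that a abstracts b under this reading.
abstraction_pairs = [
    (a, b) for a in ENTITIES for b in ENTITIES if abstracts(a, b)
]

if __name__ == "__main__":
    for a, b in abstraction_pairs:
        print(f"{a} abstracts {b}")
```

Under these assumptions the computed pairs are exactly the ones Simon lists (E1 abstracts E2, E3, E4; E3 abstracts E4), and E2 does not abstract E3 because P2 and P6 hold only for E2. A fuller model would key each property by account as well.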