- From: Simon Miles <simon.miles@kcl.ac.uk>
- Date: Tue, 7 Jun 2011 20:46:06 +0100
- To: Provenance Working Group WG <public-prov-wg@w3.org>
Hello Graham,

As Paul says, perspective is not explicitly mentioned in OPM, but it might be implied. The definition in the OPM spec is: "An account represents a description at some level of detail as provided by one or more observers" and, earlier in the document, accounts are said to be "offering different levels of explanation for [a process] execution" and that "overlapping accounts are intended to allow various descriptions of a same execution".

I would intuitively interpret the distinction between accounts to be about perspective, particularly by being from different "observers" or at different "levels".

With regard to comparing accounts of the same process, I would assume they are from different perspectives, else why have multiple accounts? I don't think there's any reason to require perspectives to be incomparable. One OPM graph can express both a coarse-grained account and a fine-grained account of the same process, which means you could express a query (graph traversal) using details from both accounts.

Thanks,
Simon

On 6 June 2011 11:24, Paul Groth <pgroth@gmail.com> wrote:
> Hi Graham,
>
> From my understanding OPM doesn't say anything about the perspective.
> An account is a coloring of the graph with some operation on that
> coloring.
>
> It doesn't say who or what an account is from.
>
> cheers,
> Paul
>
>
> On Mon, Jun 6, 2011 at 9:01 AM, Graham Klyne <GK@ninebynine.org> wrote:
>> I'm wondering if the use of "account" here is exactly the same as the use of
>> "account" in OPM. I guess Luc would know best.
>>
>> Specifically, when we talk of or compare multiple accounts of some process
>> of information production, do we require them to all be from the same
>> perspective? I think that may be what OPM assumes. Maybe it doesn't matter,
>> but if there's scope for confusion I figure we should at least be aware of
>> it.
>>
>> #g
>> --
>>
>> Simon Miles wrote:
>>>
>>> I think "invariant" is good too.
>>>
>>> I was unclear, regarding the proposal to focus on "values/things that
>>> are immutable according to some perspective or viewpoint", whether it
>>> is the latter "values" for which we determine provenance or state
>>> derivation relationships, or whether the "values" are properties of
>>> the entities which have provenance and there are other mutable (variant)
>>> values?
>>>
>>> If only for my own understanding, I tried looking across the different
>>> threads on this list. Here's my interpretation of what has been
>>> implied in terms of definitions (but I might well be misinterpreting).
>>>
>>> An entity is something identifiable.
>>> An account is a record of something that has occurred from a
>>> particular perspective.
>>> An invariant property of an entity is a property of that entity which
>>> is invariant according to a particular perspective.
>>> An abstraction of an entity is another entity with a subset of its
>>> invariant properties, according to a particular perspective.
>>> B derives from A if some of B's invariant properties are due to A's
>>> invariant properties.
>>>
>>> An example trying to capture all the above:
>>>
>>> Entities:
>>> - E1: A government data set with UK government identifier GOVID-12345
>>> - E2: The data set with a data value for row 2012 being £7,500
>>> - E3: The corrected data set with the value for row 2012 being £9,000
>>> - E4: An Excel 2010 spreadsheet containing the corrected data set
>>>
>>> Accounts:
>>> - A1: An account from a perspective in which any government data set
>>> will always retain the same UK government identifier (a new identifier
>>> means a new data set)
>>> - A2: An account from a perspective in which any change of value in a
>>> data set means it is a new version of that data set
>>> - A3: An account from a perspective in which any changes to a file by
>>> writing create a new data set, while any changes due to reading do not
>>>
>>> Invariant properties:
>>> - P1: Identifier GOVID-12345 is invariant for E1, E2, E3, E4 with
>>> respect to account A1
>>> - P2: All the data values (including £7,500 for 2012) are invariant
>>> for E2 with respect to account A2
>>> - P3: All the data values (including £9,000 for 2012) are invariant
>>> for E3, E4 with respect to account A2
>>> - P4: All bytes of the spreadsheet are invariant for E4 except those
>>> changed on reading (e.g. Excel saves the current open worksheet,
>>> cursor position etc.
even without editing) with respect to account A3
>>> - P5: The data set (E1) having existed is invariant for E1, E2, E3,
>>> E4 with respect to any account
>>> - P6: The first version of the data set (E2) having existed is
>>> invariant for E2 with respect to any account
>>> - P7: The corrected version of the data set (E3) having existed is
>>> invariant for E3, E4 with respect to any account
>>> - P8: The Excel data set (E4) having existed is invariant for E4 with
>>> respect to any account
>>>
>>> Abstractions:
>>> - E1 abstracts E2, E3, E4
>>> - E3 abstracts E4
>>>
>>> Derivation:
>>> - E3 derives from E2 because, aside from the corrected value, all
>>> other values are copied directly from it (P3 is partly due to P2)
>>> - E3 also derives from the correction made to the data set, changing
>>> £7,500 to £9,000 (could be called E5, omitted above for brevity)
>>>
>>> We could then say that the provenance of an entity is/includes a
>>> record of how that entity came to have its invariant properties.
>>>
>>> Provenance:
>>> - Provenance of E1 is how it came to be generated (P5) and came to
>>> have its ID (P1)
>>> - Provenance of E2 is how it came to be generated (P5, P6), given its
>>> ID (P1), and populated with the data it has (P2)
>>> - Provenance of E3 is how it came to be generated (P5, P7), given its
>>> ID (P1), and populated with the data it has (P3)
>>> - Provenance of E4 is how it came to be generated (P5, P7, P8), given
>>> its ID (P1), populated with the data it has (P3), and serialised to
>>> its given bytes (P4)
>>>
>>> It would be good to know if others are interpreting the consensus in
>>> the same way!
>>>
>>> Thanks,
>>> Simon
>>>
>>> On 3 June 2011 21:36, Luc Moreau <L.Moreau@ecs.soton.ac.uk> wrote:
>>>>
>>>> I think I am also comfortable with using the term "invariant", if it
>>>> helps gain consensus.
>>>>
>>>> Professor Luc Moreau
>>>> Electronics and Computer Science
>>>> University of Southampton
>>>> Southampton SO17 1BJ
>>>> United Kingdom
>>>>
>>>> On 3 Jun 2011, at 15:06, "Graham Klyne" <GK@ninebynine.org> wrote:
>>>>
>>>>> Luc,
>>>>> Jim,
>>>>> Khalid,
>>>>>
>>>>> I'm responding to all of you at once.
>>>>>
>>>>> Short answer: what Luc says.
>>>>>
>>>>> I find myself preferring the term "invariant" to "immutable" for just
>>>>> this reason.
>>>>>
>>>>> ...
>>>>>
>>>>> Longer answer: there's not a specific thing I want to capture through
>>>>> derivation of mutable resources. I'm just concerned that insisting on
>>>>> immutability may prevent useful expression.
>>>>>
>>>>> I'll illustrate with an example from a completely different field. For
>>>>> some years, I have been involved peripherally in the definition and
>>>>> registration of URI schemes, and remain IANA's designated reviewer for
>>>>> new URI schemes. Several years ago, there was much discussion about
>>>>> registering new URI schemes vs registering new URN namespaces [2] vs
>>>>> using http URIs for everything. A specific example is the info: URI
>>>>> scheme [3]. I argued at the time that this could equally be served by
>>>>> a URN namespace. But the original definition of URN requirements [4]
>>>>> made some apparently strong assertions about persistence and
>>>>> permanence of URNs which the community behind info felt were too
>>>>> constraining, so we ended up with an arguably unnecessary new URI
>>>>> scheme. Some further history at [5].
>>>>>
>>>>> Looking back, I now think the original language in [4] was
>>>>> over-interpreted, and many people didn't fully recognize that
>>>>> permanence of identity didn't constrain the identified thing itself
>>>>> possibly changing or going away. There was an expectation of
>>>>> immutability, not even explicitly stated, but also not dispelled.
>>>>>
>>>>> This is the kind of concern I have with insisting on immutability in
>>>>> subjects of provenance at the outset.
>>>>>
>>>>> [1] http://www.ietf.org/rfc/rfc2141.txt
>>>>> [2] http://www.ietf.org/rfc/rfc2611.txt,
>>>>> http://tools.ietf.org/html/rfc3406
>>>>> [3] http://www.ietf.org/rfc/rfc4452.txt
>>>>> [4] http://tools.ietf.org/html/rfc1737
>>>>> [5] http://www.w3.org/TR/uri-clarification/
>>>>>
>>>>> #g
>>>>> --
>>>>>
>>>>> Luc Moreau wrote:
>>>>>>
>>>>>> Hi Jim, Graham, Klyne,
>>>>>> Following yesterday's call, and seeing this thread, it seems that
>>>>>> "Immutable value" is too restrictive because too absolute.
>>>>>> What about saying we focus on "/values/things that are immutable
>>>>>> according to some perspective or viewpoint/"?
>>>>>> It seems to offer the necessary trade-off and flexibility, with
>>>>>> - a stable property required for provenance
>>>>>> - change being allowed according to other viewpoints.
>>>>>> Cheers,
>>>>>> Luc
>>>>>> On 06/03/2011 02:03 AM, Myers, Jim wrote:
>>>>>>>
>>>>>>> What do you want to capture with derivation of mutable resources?
>>>>>>> Simply that one mutable resource can be used in a process and produce
>>>>>>> another different mutable resource? If so, I'd ask why we should
>>>>>>> consider this case any different than immutable? (Does the fact that
>>>>>>> most of what we want to call immutable resources are undergoing
>>>>>>> constant change (bits getting refresh charges, files moving about in
>>>>>>> memory caches, etc.) cause any issue with the basic OPM-style model?
I think all of these cases are
>>>>>>> handled just fine by OPM-style constructs and I'd argue further that
>>>>>>> the key concept about artifacts was not complete immutability with
>>>>>>> respect to any process we can think of but immutability with respect
>>>>>>> to the processes involved in the provenance (Eggs used in cake baking
>>>>>>> do not come out as modified eggs (they become a new cake), but an egg
>>>>>>> in the fridge and the warmer egg waiting to be mixed are considered
>>>>>>> the same egg only because we don't want to discuss/report on the
>>>>>>> warming process that occurred. The fact that an egg has mutability in
>>>>>>> its temperature doesn't make it a bad artifact in OPM or cause
>>>>>>> trouble in reporting a baking process...)
>>>>>>>
>>>>>>> The mutable case that presents a question is: should we provide a
>>>>>>> second mechanism to allow one to describe a process that changes the
>>>>>>> state of a mutable resource? That is, to say that the egg with
>>>>>>> temperature cold is the same egg with temperature warm after a
>>>>>>> heating process. I suspect that we can't avoid this use case
>>>>>>> completely but we might not have to create a separate mechanism: If
>>>>>>> we allow a resource egg to be associated with cold-egg and warm-egg
>>>>>>> resources, we can use the OPM-like mechanism (cold-egg <-- heating
>>>>>>> <-- warm-egg) while adding that cold-egg and warm-egg are 'aspects
>>>>>>> of' the same mutable egg which 'participates' in a heating process.
>>>>>>> I think this is general and minimally disruptive. One could say that
>>>>>>> an egg participated in heating without creating other resources, but
>>>>>>> one could not directly describe the temperature of the egg before and
>>>>>>> after heating without creating the cold and warm egg artifacts.
I think this also covers what we
>>>>>>> want from agents and sources - we want to convey that they
>>>>>>> participate in a process and, while their state changes as they do
>>>>>>> so, we don't want to document their state changes. But as Simon says
>>>>>>> we may still want to treat them (e.g. the Royal Society) as resources
>>>>>>> and talk about their creation so it would be valuable if they could
>>>>>>> just be artifacts in the context of creation/founding type events.
>>>>>>> Today, we have agents and sources as different types than artifact so
>>>>>>> there is no way to talk about their founding, etc.
>>>>>>>
>>>>>>> -- Jim
>>>>>>>
>>>>>>> ________________________________
>>>>>>>
>>>>>>> From: public-prov-wg-request@w3.org on behalf of Graham Klyne
>>>>>>> Sent: Thu 6/2/2011 3:45 PM
>>>>>>> To: Khalid Belhajjame
>>>>>>> Cc: Luc Moreau; public-prov-wg@w3.org
>>>>>>> Subject: Re: PROV-ISSUE-7 (define-derivation): Definition for Concept
>>>>>>> 'Derivation' [Provenance Terminology]
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Khalid Belhajjame wrote:
>>>>>>>
>>>>>>>> Hi Graham,
>>>>>>>>
>>>>>>>>> I agree that many of the examples of derivation we have raised
>>>>>>>>> relate to resource states. But if, as has been suggested by myself
>>>>>>>>> and others, resource states are themselves resources (especially
>>>>>>>>> when named for the purposes of expressing a derivation), then such
>>>>>>>>> derivations can equally be regarded as relating resources. I think
>>>>>>>>> that's more a difference of terminology than fundamental.
>>>>>>>>
>>>>>>>> Would it be fair then to say that in that view resources are
>>>>>>>> immutable resources?
>>>>>>>>
>>>>>>> In the case of resources representing a snapshot of state, yes.
>>>>>>>
>>>>>>>
>>>>>>>> Which brings me to the question, do we want to express derivations
>>>>>>>> between mutable resources, or is that just something that we should
>>>>>>>> avoid at this point?
>>>>>>>>
>>>>>>> (I'm finishing this email after today's telecon, so it's a bit of a
>>>>>>> re-run.)
>>>>>>>
>>>>>>> I think that many of our use-cases are based on invariant values, and
>>>>>>> the near-term goal is to find expression for these. So we definitely
>>>>>>> do want to express derivations between non-varying values. But in so
>>>>>>> doing, it's not clear to me (yet) that we need to exclude mutable
>>>>>>> resources, so I say let's keep our options open and not close off any
>>>>>>> possibilities that we don't have to.
>>>>>>>
>>>>>>> So my answer to avoiding mutable resources is: "yes and no".
>>>>>>>
>>>>>>> #g
>>>>>>> --
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>> Thanks, khalid
>>>>>>>>
>>>>>>>>
>>>>>>>>> Where I think I may diverge from what you say is that I would not
>>>>>>>>> limit such expressions of derivation to resources that happen to be
>>>>>>>>> a state (or snapshot of state) of some resource. I think defining
>>>>>>>>> that distinction in a hard-and-fast way, that also aligns with
>>>>>>>>> various intuitions we may have about derivation, may prove difficult
>>>>>>>>> to achieve (e.g. as I think is suggested by Jim Myers in
>>>>>>>>> http://lists.w3.org/Archives/Public/public-prov-wg/2011Jun/0015.html
>>>>>>>>> (*)).
>>>>>>>>>
>>>>>>>>> #g
>>>>>>>>> --
>>>>>>>>>
>>>>>>>>> (*) I just love the W3C mailing list archives - so easy to find
>>>>>>>>> links to messages, and thus capture provenance!
>>>>>>>>>
>>>>>>>>> Khalid Belhajjame wrote:
>>>>>>>>>
>>>>>>>>>> Hi,
>>>>>>>>>>
>>>>>>>>>> From the discussion so far on derivation it seems that most people
>>>>>>>>>> tend to define derivation between resource states or resource
>>>>>>>>>> state representations, but not for resources.
>>>>>>>>>>
>>>>>>>>>> My take on this is that in a context where a resource is mutable,
>>>>>>>>>> derivations will mainly be used to associate resource states and
>>>>>>>>>> resource state representations.
>>>>>>>>>>
>>>>>>>>>> That said, based on derivations connecting resource states and
>>>>>>>>>> resource state representations, one can infer new derivations
>>>>>>>>>> between resources. For example, consider the resource r_1 and the
>>>>>>>>>> associated resource state r_1_s, and consider that r_1_s was used
>>>>>>>>>> to construct a new resource state r_2_s, actually the first state,
>>>>>>>>>> of the resource r_2. We can state that r_2_s is derived from r_1_s,
>>>>>>>>>> i.e., r_1_s -> r_2_s. We can also state that the resource r_2 is
>>>>>>>>>> derived from the resource r_1, i.e., r_1 -> r_2
>>>>>>>>>>
>>>>>>>>>> PS: I added a definition of derivation along these lines to the
>>>>>>>>>> wiki:
>>>>>>>>>> http://www.w3.org/2011/prov/wiki/ConceptDerivation
>>>>>>>>>>
>>>>>>>>>> Thanks, khalid
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On 01/06/2011 07:49, Luc Moreau wrote:
>>>>>>>>>>
>>>>>>>>>>> Hi Graham,
>>>>>>>>>>>
>>>>>>>>>>> Isn't it that you used the duri scheme to name the two resource
>>>>>>>>>>> states that exist in
>>>>>>>>>>> this scenario?
>>>>>>>>>>>
>>>>>>>>>>> In your view of the web, is there a notion of stateful resource?
>>>>>>>>>>> Does it apply here?
>>>>>>>>>>>
>>>>>>>>>>> Thanks,
>>>>>>>>>>> Luc
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On 31/05/11 23:57, Graham Klyne wrote:
>>>>>>>>>>>
>>>>>>>>>>>> Luc Moreau wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> Graham,
>>>>>>>>>>>>>
>>>>>>>>>>>>> In my example, I really mean for the two versions of the chart
>>>>>>>>>>>>> to be available at
>>>>>>>>>>>>> the same URI. (So, definitely, an uncool URI!)
>>>>>>>>>>>>>
>>>>>>>>>>>>> In that case, there is a *single* resource, but it is stateful.
>>>>>>>>>>>>> Hence, there
>>>>>>>>>>>>> are two *resource states*, one generated using (stats2), and the
>>>>>>>>>>>>> other using (stats3).
>>>>>>>>>>>>>
>>>>>>>>>>>> Luc,
>>>>>>>>>>>>
>>>>>>>>>>>> I had interpreted your scenario as using a common URI as you
>>>>>>>>>>>> explain.
>>>>>>>>>>>>
>>>>>>>>>>>> But there are still several resources here, but they are not all
>>>>>>>>>>>> exposed on the web or assigned URIs. I'm appealing here to
>>>>>>>>>>>> anything that *might* be identified as opposed to things that
>>>>>>>>>>>> actually are assigned URIs. (For example, the proposed duri:
>>>>>>>>>>>> scheme might be used -
>>>>>>>>>>>> http://tools.ietf.org/id/draft-masinter-dated-uri-07.html)
>>>>>>>>>>>>
>>>>>>>>>>>> (And the URI is perfectly "cool" if it is specifically intended
>>>>>>>>>>>> to denote a dynamic resource. A URI used to access the current
>>>>>>>>>>>> weather in London can be stable if properly managed.)
>>>>>>>>>>>>
>>>>>>>>>>>> (I think this is all entirely consistent with my earlier stated
>>>>>>>>>>>> positions.)
>>>>>>>>>>>>
>>>>>>>>>>>> #g
>>>>>>>>>>>> --
>>>>>>>>>>>>
>>>>>>>>>>>>
>>>>>>>>>>>>> Of course, if blogger had used cool uris, then, c2s2 and c2s3
>>>>>>>>>>>>> would be different resources.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Luc
>>>>>>>>>>>>>
>>>>>>>>>>>>> On 05/31/2011 02:25 PM, Graham Klyne wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I see (at least) two resources associated with (c2): one
>>>>>>>>>>>>>> generated using (stats2), and the other using (stats3). We
>>>>>>>>>>>>>> might call these (c2s2) and (c2s3).
>>>>>>>>>>>>>>
>>>>>>
>>>>>> --
>>>>>> Professor Luc Moreau
>>>>>> Electronics and Computer Science
>>>>>> University of Southampton
>>>>>> Southampton SO17 1BJ
>>>>>> United Kingdom
>>>>>> tel: +44 23 8059 4487
>>>>>> fax: +44 23 8059 2865
>>>>>> email: l.moreau@ecs.soton.ac.uk
>>>>>> http://www.ecs.soton.ac.uk/~lavm

--
Dr Simon Miles
Lecturer, Department of Informatics
Kings College London, WC2R 2LS, UK
+44 (0)20 7848 1166
Received on Tuesday, 7 June 2011 19:46:35 UTC