Re: PROV-ISSUE-1 (define-resource): Definition for concept 'Resource' [Provenance Terminology]


I think it makes sense, mostly...

In the first instance, I think we need to define something that works when any 
permitted variations do not affect the veracity of the provenance data under 
consideration (roughly what I mean by "invariant").

Beyond that, and in this I am speculating, I imagine that we would expect to 
have different identifiers (to avoid paradox) with some additional stated 
relationships (to provide an appropriate level of connection).  Details TBD.

Crucially, I would expect that the simple "low level" accounts (per your 
description) would apply across a range of identified resources, specifically to 
something that looks like the root of a class hierarchy (I'm fudging a 
class/instance distinction here), and that refinements of such a simple account 
would apply to successively more constrained versions of the common case.

So, roughly, when we find a paradox, we would need to "fork" the case under 
consideration to separate the conflicting assertions.
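
To make the "fork" idea a little more concrete, here is a minimal sketch in Python. It is not a proposal for the model. It assumes assertions are simple (identifier, property, value, account) tuples; the "viewOf" relationship name, the "#account" identifier suffix and the sample values are all made up for illustration.

```python
from collections import defaultdict


def fork_conflicts(assertions):
    """Split identifiers that carry conflicting values into per-account ones.

    assertions: list of (resource_id, property, value, account) tuples.
    Returns (forked_assertions, links), where links relate each new
    identifier back to the original via a hypothetical "viewOf" relation.
    """
    # Collect every value asserted for each (identifier, property) pair.
    values = defaultdict(set)
    for rid, prop, value, _account in assertions:
        values[(rid, prop)].add(value)

    # An identifier is paradoxical if some property has more than one value.
    conflicted = {rid for (rid, _prop), vals in values.items() if len(vals) > 1}

    forked, links = [], set()
    for rid, prop, value, account in assertions:
        if rid in conflicted:
            new_id = f"{rid}#{account}"  # one identifier per account
            links.add((new_id, "viewOf", rid))
            forked.append((new_id, prop, value, account))
        else:
            forked.append((rid, prop, value, account))
    return forked, sorted(links)


# Illustrative data only: two accounts disagree about "my-vehicle".
assertions = [
    ("my-vehicle", "plate", "ABC123", "physical"),
    ("my-vehicle", "plate", "RENT99", "legal"),
    ("tollway", "name", "I-88", "legal"),
]
forked, links = fork_conflicts(assertions)
```

After the fork, each account talks about its own identifier, and the stated links keep the two connected to the common case they were split from.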

This is all very speculative:  I wouldn't want to claim applicability for such 
an approach until it is proven to work in credible scenarios.

The last sentence of your description confuses me: "small bits of the general 
case will sneak into an account".  I would expect the entirety of the "general 
case" to be a valid part of all accounts.  That's what I understand by "general 
case" (which may or may not be equivalent to the "simple case" from which the 
analysis starts).  Maybe you mean something else?

Is this all starting to sound like a version of the frame problem?


Myers, Jim wrote:
> They could be separate accounts, but that leads to two questions - 
> Do the legal and physical witnesses use the same identifiers in their
> accounts? If so, we have a paradox/the witnesses appear to disagree. If
> not, we can't connect the accounts.
> If I'm a provenance aggregator (or we have a judge who would like to
> read both accounts and make some claim about what the overall truth is),
> how can I represent things in a way that shows that there is no paradox
> in this case (that the legal and physical objects diverge over time)? 
> I would expect that in simple/low-level accounts, we'll often be talking
> about only one perspective (or will think we are) and thus the world
> looks relatively simple - some mutable thing has states that go through
> processes to produce new states, but thinking of mutable things that are
> related to 'things that hold some aspects of state constant' (legal
> state, physical state, some physical state but not other parts of
> physical state) allows the more general case. And, when we think we're
> in the simple case but we find a paradox, or want to talk about agents
> or sources around the edges of our main provenance trail, small bits of
> the general case will sneak into an account.
> Does that make sense?
>  Jim
>> -----Original Message-----
>> From: [mailto:public-prov-wg-
>>] On Behalf Of Paolo Missier
>> Sent: Monday, June 06, 2011 10:26 AM
>> To:
>> Subject: Re: PROV-ISSUE-1 (define-resource): Definition for concept
>> 'Resource' [Provenance Terminology]
>> Jim
>> isn't the legal/physical distinction from your example an instance of
>> different "accounts" (in OPM terms), i.e., different observers
>> collecting different sequences of events, which may partially overlap?
>> This is just to clarify; it seems to me we do have concepts to deal
>> with this distinction.
>> Regards, -Paolo
>>   On 6/2/11 3:15 PM, Myers, Jim wrote:
>>> This issue is not just aggregation. I have an iPass and the state
>>> considers any vehicle with that iPass in it to be mine and expects me
>>> to pay tolls when it engages in drive-on-tollway events. There are
>>> times when that legal vehicle is my physical car and times when it's a
>>> rental. (I.e. I would not say my car had all its parts replaced when I
>>> use a rental car).
>>> I think we have exactly this issue in our scenarios - the
>>> legal/logical data from the government corresponds to different
>>> physical file objects at different times and we want to track
>>> provenance across the legal/logical and physical processes that occur.
>>> We are asking in queries whether the logical result depends on the
>>> government data even when all of the physical bits of the input used
>>> are completely different (on a different disk, perhaps in a different
>>> format) than the government data file. If we don't track the logical
>>> government data separately from physical files that at times are
>>> manifestations of it, we get paradoxes from our limited model that
>>> don't exist in the real world. (File copying preserves the
>>> logical-to-physical correspondence, editing a file to be all zeros
>>> does not, so if you have a derived result from any copy created before
>>> an edit occurred, your result is logically dependent on the government
>>> data...).
>>> We really have these types of logical to physical relationships
>>> throughout science as well - we assume the reading from the sensor is
>>> the logical temperature but would question that relationship if we had
>>> provenance of a 'smashed' event for the sensor. The logical
>>> temperature may at times have separate provenance from the sensor
>>> reading and we may want to track both.
>>> In a practical sense, I think modeling this way involves very little
>>> change to OPM-style provenance. In addition to artifact-process
>>> execution-artifact type chains, you have the occasional links that
>>> connect resources - physical file is a manifestation of the gov data -
>>> that allow you to cross from thinking about legal/logical/intellectual
>>> provenance to physical/computational provenance, etc. That 'minor'
>>> addition would avoid further discussion of how/when to categorize
>>> specific things as mutable/immutable, etc. and probably remove the
>>> need for special case opm:agent and pml:source style types as well.
>>>   Jim
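
For reference, the "manifestation" links and the copy-versus-edit distinction in the quoted message could be sketched roughly as below. This is only an illustration under assumptions: the file names, the "gov-data" identifier and the per-edge carries_content flag are hypothetical and are not part of OPM.

```python
def logically_depends(artifact, logical_resource, derived_from, manifestation_of):
    """Return True if `artifact` logically depends on `logical_resource`.

    derived_from: artifact -> list of (source_artifact, carries_content).
        carries_content is an assumed flag saying whether the process kept
        the logical content (a copy does, zeroing a file does not).
    manifestation_of: physical artifact -> logical resource it manifests.
    """
    seen, stack = set(), [artifact]
    while stack:
        node = stack.pop()
        if node in seen:
            continue
        seen.add(node)
        # Cross from the physical chain to the logical resource here.
        if manifestation_of.get(node) == logical_resource:
            return True
        # Only follow derivations that preserved the logical content.
        for source, carries_content in derived_from.get(node, []):
            if carries_content:
                stack.append(source)
    return False


# Hypothetical physical provenance chain.
derived_from = {
    "copy.csv": [("gov.csv", True)],      # file copy
    "zeroed.csv": [("copy.csv", False)],  # copy overwritten with zeros
    "result_a": [("copy.csv", True)],     # computed before the edit
    "result_b": [("zeroed.csv", True)],   # computed after the edit
}
manifestation_of = {"gov.csv": "gov-data"}

depends_a = logically_depends("result_a", "gov-data", derived_from, manifestation_of)
depends_b = logically_depends("result_b", "gov-data", derived_from, manifestation_of)
```

Here result_a depends on the government data even though it never touched gov.csv itself, while result_b does not, because the zeroing step broke the logical-to-physical correspondence.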

Received on Monday, 6 June 2011 21:04:14 UTC