PROV-ISSUE-1 (define-resource): Definition for concept 'Resource' [Provenance Terminology]


OK, so far I have been able to catch up with the Resource ISSUE 1, and believe me, it took a while.  I have not digested anything else.
It seems that some consensus is emerging, so let me throw in my 2 cents as a summary of my understanding, plus more questions, hoping not
to undo the progress that has been made on this.

I seem to see a consensus that resources have, or can be given, an identity:

   >   - For our purposes, a resource is anything which can be referred to  (SM)

There is also a discussion on whether an Information Object has the same status as a resource that a physical object has, but I
wouldn't be able to add to that discussion. To me, the objects that matter are primarily data structures, documents, and
assertions, and I think what we are saying does apply to those.

I also agree with SM, GK, etc. that

   >   - When we talk about the provenance of a resource, we mean the
   >  provenance of its state on asking the question.

so we also agree that there is an implicit notion of resource state:

- resource state ->  r-snapshot   (LM)

and I personally agree that any notion of provenance refers to a specific state of a resource.  Naturally here we mean "observable
state". I have not seen the notion of observer introduced in this discussion (I have yet to catch up with the others!), but it seems
natural that provenance is relative to an observer.

- the fact that the Web architecture defines its foundational concepts similarly should be viewed as a convenience which will help
ground the concepts, rather than a set of constraints that we are bound to.


- can we also assume that provenance is /monotonic/ wrt the state evolution of the resource it refers to?
This is desirable (for computational purposes) and seems to follow naturally from associating provenance with a state: let r_s be a
resource r in state s. Its provenance prov(r_s) is a subset of prov(r_{s'}) for any state s' that temporally follows s. Yes?

- Given a resource r in a state s: r_s, one can create one or more representations ("manifestations") repr(r_s) of r_s. These are
all r-snapshots of r_s.

- importantly, Jun writes:

   >  If f1 is a file, then it is a representation of a resource, not a
   >  resource any more, right?

I would argue that repr(r_s) *should be a resource itself*, for any resource r and (visible) state s. Indeed, it has an initial
state (fixed at the time it is created from the underlying resource state), and its provenance at that state is simply the provenance of r_s,
plus the action of creating repr(r_s). It can then evolve independently (but monotonically) as that new representation is
acted upon. By monotonicity, the provenance of any further state has the provenance just mentioned as its prefix.
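
To make the last two points concrete, here is a minimal sketch in Python. It assumes, purely for illustration, that provenance is an ordered record of actions, so that "subset" becomes "prefix"; the names Snapshot, evolve, represent and is_prefix are mine and not part of any proposed vocabulary.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class Snapshot:
    """A resource r observed in a state s, i.e. r_s (hypothetical model)."""
    resource: str
    state: int
    provenance: Tuple[str, ...]  # assumption: an ordered record of actions


def evolve(snap: Snapshot, action: str) -> Snapshot:
    """Move to the next state; provenance only ever grows (monotonicity)."""
    return Snapshot(snap.resource, snap.state + 1, snap.provenance + (action,))


def represent(snap: Snapshot, name: str) -> Snapshot:
    """Create repr(r_s) as a new resource in its own initial state.

    Its initial provenance is prov(r_s) plus the act of creating it.
    """
    created = f"create {name} from {snap.resource}@{snap.state}"
    return Snapshot(name, 0, snap.provenance + (created,))


def is_prefix(earlier: Tuple[str, ...], later: Tuple[str, ...]) -> bool:
    """True if 'earlier' is an initial segment of 'later'."""
    return later[:len(earlier)] == earlier


# Usage: a resource r, a file f1 representing r in state 1, then both evolve.
r0 = Snapshot("r", 0, ("create r",))
r1 = evolve(r0, "edit r")
f0 = represent(r1, "f1")
f1 = evolve(f0, "edit f1")
r2 = evolve(r1, "edit r again")
```

In this toy model prov(r_1) is a prefix of prov(f1) in every later state of f1, while later changes to r (such as r2) do not leak into f1: the representation evolves independently once created.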


I do have a problem with "containers" as a separate notion from resource, however.
Isn't a database a container? And a resource? (It does have a state, which is the set of all its elements, and for a given state I
can certainly exhibit the provenance of each data item it contains.)

So I am not sure the notion of container is useful here, or even well-founded: you end up with issues of granularity, because
containers may be nested. But then anything non-atomic, like a tuple, is a container, and yet it does have a provenance, as we know.
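
A rough Python sketch of what I mean, assuming a single hypothetical Resource type whose state includes its members, so that a database, a tuple and a field are all just resources at different levels of nesting (the names Resource and item_provenance are illustrative only):

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Resource:
    """Anything that can be referred to; 'containers' are not a separate type."""
    name: str
    provenance: List[str] = field(default_factory=list)
    members: List["Resource"] = field(default_factory=list)  # empty if atomic


def item_provenance(r: Resource, prefix: str = "") -> Dict[str, List[str]]:
    """Return the provenance of r and of everything nested inside it,
    keyed by a slash-separated path."""
    path = prefix + r.name
    out = {path: list(r.provenance)}
    for m in r.members:
        out.update(item_provenance(m, path + "/"))
    return out


# Usage: a database containing a tuple containing a field.
name_field = Resource("name", ["entered by curator"])
row = Resource("row1", ["inserted"], [name_field])
db = Resource("db", ["created"], [row])
result = item_provenance(db)
```

Here the database, the tuple and the field each carry their own provenance, and nothing in the model needs to know which of them counts as a "container".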

Oh, well. Just more noise, perhaps.


-----------  ~oo~  --------------
Paolo Missier -,
School of Computing Science, Newcastle University,  UK

Received on Wednesday, 1 June 2011 19:38:47 UTC