Re: PROV-ISSUE-1 (define-resource): Definition for concept 'Resource' [Provenance Terminology]

Luc,

Responses below...

Luc Moreau wrote:
> Some comments are interleaved.
> 
> On 05/24/2011 10:39 AM, Graham Klyne wrote:
>> Luc Moreau wrote:
>>>
>>> As a follow on to my previous message on resources, the wiki page [1]
>>> also makes the distinction between a resource, a resource state, a
>>> resource state representation.
>>>
>>> What is provenance going to be about?
>>> a. provenance of a resource?
>>> b. provenance of a resource state?
>>> c. provenance of a resource state representation?
>>>
>>> Does it make sense to talk about a? b? c?
>>> If so, what kind of "provenance statement" would be involved in a? b? c?
>>>
>>> Cheers,
>>> Luc
>>>
>>> [1] http://www.w3.org/2011/prov/wiki/Provenance_and_Web_Architecture
>>
>> I think it makes sense to talk about (a) and (c), but not generally 
>> about (b).
>>
>> My rationale for this is that (b) is an intermediate construct that is 
>> used to explain the linkage between (a) and (c), but in general (as 
>> far as I'm aware) has no visibility exposed via Web architecture other 
>> than (c).
>>
>> Example.
>>
>> Suppose we have an RDF file describing zebras:
>>    (r1) http://example.com/zebras.rdf
> 
> Again, it would be useful to clarify what you mean here. Referring to
> http://www.w3.org/2011/rdf-wg/wiki/GraphConceptTerminology,
> do you see r1 as a g-box? g-snap? g-text?

Strictly, none of the above.  Reading the definitions given, I initially thought 
the closest could be g-text.

But what I actually mean is a file accessible on the web at the given URI, which 
when dereferenced returns a g-text.  The returned g-text could be interpreted 
(per the corresponding RDF format syntax specification) as describing a g-snap.  
It seems that the URI used would formally denote the g-box, which would be the 
web "resource", and the g-snap would be its state at the time the dereferencing 
is serviced.

Thus another answer to your question might be "all of the above".
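
A rough sketch of that reading, in Python.  The names GBox and parse_gtext, the 
one-triple-per-line "syntax" and the rdf:type/ex:Zebra terms are invented here 
purely for illustration; they are assumptions, not part of the RDF WG 
terminology or of any RDF library.

```python
from dataclasses import dataclass
from typing import FrozenSet, Iterable, Tuple

Triple = Tuple[str, str, str]


def parse_gtext(gtext: str) -> FrozenSet[Triple]:
    """Interpret a g-text as a g-snap (toy format: one 3-token triple per line)."""
    triples = set()
    for line in gtext.splitlines():
        parts = line.split()
        if len(parts) == 3:
            triples.add((parts[0], parts[1], parts[2]))
    return frozenset(triples)


@dataclass
class GBox:
    """The web "resource": a mutable container denoted by a URI."""
    uri: str
    state: FrozenSet[Triple] = frozenset()  # the g-snap: the state right now

    def update(self, triples: Iterable[Triple]) -> None:
        # The resource changes state; its URI does not change.
        self.state = frozenset(triples)

    def dereference(self) -> str:
        # Return a g-text: a representation of the state at the time of the request.
        return "\n".join(" ".join(t) for t in sorted(self.state))


# (r1) stands for the g-box; dereferencing it yields a g-text.
box = GBox("http://example.com/zebras.rdf")
box.update({("http://example.com/zebras/#zebra1", "rdf:type", "ex:Zebra")})
gtext_then = box.dereference()
snap_then = parse_gtext(gtext_then)

# Later the file is edited: same URI, same g-box, different g-snap.
box.update({
    ("http://example.com/zebras/#zebra1", "rdf:type", "ex:Zebra"),
    ("http://example.com/zebras/#zebra2", "rdf:type", "ex:Zebra"),
})
snap_now = parse_gtext(box.dereference())
```

Note that the g-snap never appears on the wire in this sketch: a client only 
ever holds the URI and a g-text, and recovers the g-snap by parsing.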

(Until we start talking about RDF internal structures and the minutiae of how 
RDF concepts are manipulated and updated, I'm not sure this terminology is 
especially useful.)


>> which contains statements referring to URIs
>>    (r2) http://example.com/zebras/#zebra1
>>    (r3) http://example.com/zebras/#zebra2
>>
>> Provenance information about (r1) would be an instance of (c) - noting 
>> that in this case it would relate to the state of multiple resources.
>>
>> Provenance information about the referents of (r2) and (r3) would be 
>> an instance of (a) - where they were born, who were their parents, etc.
>>
>> (I can imagine situations in which one explicitly constructs and 
>> identifies a notion of resource state (e.g. the state of health of a 
>> zebra), but I'd see that as a special case constructed by an 
>> application, not part of the general Web model.  But even in such a 
>> case, what would it mean to refer to the provenance of an animal's 
>> state of health, as opposed to a record (representation) of its state 
>> of health?)
>>
> Quoting the graph concept terminology document,
> 
> There may exist a need to make statements about a g-snap,..., thus a way 
> to refer to one may be required
> 
> There may exist a need to make statements about a "state" of a g-box, 
> either previous or future, thus a way to refer to a g-snap as being the 
> state of g-box-X at Y time may be needed.
> 
> It seems this is quite relevant to us.

(I note that these statements are quite speculative, in that they are not 
confirmed requirements.)

Roughly, I think that (echoing your a, b, c labels):

(a) statements about the resource would be statements about the g-box.

(b) statements about the resource state would be statements about the g-snap. 
This is the bit I'm not convinced is generally useful or meaningful w.r.t. 
provenance.

(c) statements about state representation would be statements about a data 
resource containing the g-text.  There could be multiple instances on different 
systems with different URIs, and these instances would then be different 
resources, even if they encompass a common representation.  E.g. the provenance 
of a g-text retrieved to Joe's computer would be distinct from the provenance of 
the same g-text retrieved to Kylie's computer.  IMO.
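
To illustrate point (c), a minimal sketch in Python.  The RetrievedCopy class, 
the file: URIs and the provenance fields are all hypothetical, made up for this 
example; nothing here is a proposed vocabulary.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RetrievedCopy:
    """A data resource holding a g-text; identified by its own URI."""
    uri: str         # hypothetical URI naming this particular copy
    octets: bytes    # the g-text exactly as retrieved
    source_uri: str  # the URI that was dereferenced


gtext = b"http://example.com/zebras/#zebra1 rdf:type ex:Zebra\n"

joe = RetrievedCopy("file://joe.example/zebras.rdf", gtext,
                    "http://example.com/zebras.rdf")
kylie = RetrievedCopy("file://kylie.example/zebras.rdf", gtext,
                      "http://example.com/zebras.rdf")

# Provenance is keyed by the URI of each copy, not by the octets they share.
provenance = {
    joe.uri: {"retrieved_by": "Joe", "retrieved_from": joe.source_uri},
    kylie.uri: {"retrieved_by": "Kylie", "retrieved_from": kylie.source_uri},
}

same_representation = joe.octets == kylie.octets  # identical g-text
same_resource = joe == kylie                      # but distinct resources
```

The octets alone cannot carry the distinction; it only appears once each copy 
is given its own identity.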

Per web architecture, anything about which one may make a statement (such as a 
statement of provenance) must be a resource, and one generally needs its URI to 
publish statements about it on the web.  Following my comments at (c) above, I 
become less sure that there's a separate requirement to make statements about a 
resource state representation, as in order to do that it must be established as 
a distinct resource, which may amount to more than just the sequence of octets 
that constitute the representation.

#g
--

Received on Tuesday, 24 May 2011 12:08:18 UTC