Re: PROV-ISSUE-1 (define-resource): Definition for concept 'Resource' [Provenance Terminology] from Graham Klyne on 2011-05-24 (public-prov-wg@w3.org from May 2011)

From: Graham Klyne <GK@ninebynine.org>
Date: Tue, 24 May 2011 12:02:58 +0100
To: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
CC: public-prov-wg@w3.org
Message-ID: <4DDB9062.2090703@ninebynine.org>

Hi Luc,

Luc Moreau wrote:
> Hi Graham,
> Responses interleaved.

Ditto...

> On 05/24/2011 10:12 AM, Graham Klyne wrote:
>> Luc Moreau wrote:
>>> Dear all,
>>>
>>> I am pleased to see that some definitions are being uploaded on the 
>>> wiki; in particular, I see definitions of resources, which I would
>>> like to begin debating during the teleconference this week.
>>>
>>> For now, I just use this definition:
>>>
>>>  A resource can be anything that might be identified by URI
>>>
>>> Going back to the Data Journalism example [1], it is not entirely clear
>>> that such a notion of resource encapsulates all the "data entities" that
>>> we find here.    I can see r1 and r2 being resources.
>>>
>>> However, what about f1, which, for instance, could have been generated
>>> by an xslt transform over d1. f1 could be a file on the local file 
>>> system, which then
>>> is made available later as a resource r1.
>>>
>>> Likewise, lcp1 is a local copy of a serialization of r1.  Again, lcp1 
>>> could be a file
>>> on the file system.
>>>
>>> Are lcp1 and f1 resources?
>>
>> > [1] http://www.w3.org/2011/prov/wiki/ProvenanceExample
>>
>> My default answers would be "Yes" and "Yes", in that they might indeed 
>> be identified by URIs, even if they are not generally accessible using 
>> those URIs (e.g. lcp1 being a local copy).
>>
>> Less clear to me, looking at the example, is whether f1 ("published 
>> RDF data") is the same as r1 ("rdf data available as a web 
>> resource").  Reading the description, I would be inclined to say that 
>> they are the same resource, but I can see some scope for differing 
>> interpretations depending on the intent here.
> 
> Joe Blog may have /home/joe/lcp1.rdf on his file system, and Joe Blog2 
> may also have a different file /home/joe/lcp1.rdf
> at the same location on his own computer. So the file path does not 
> identify the resource. Would you force the minting
> of new unique URIs for every local resource?

I would not.  The definition says *might* be identified by a URI.  In these 
cases, I think the possibility of such identification exists, even if no URI has 
actually been minted.  And any such URI that might be used is not necessarily 
related to the file system path.  For example, they could be urn:uuid: URIs 
(http://www.ietf.org/rfc/rfc4122.txt).
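
A minimal sketch of this idea, assuming Python's standard uuid module.  The 
function name mint_resource_uri and the two dictionary records are purely 
illustrative and not part of the example under discussion; the point is only 
that two files at the same local path can be given distinct URIs that owe 
nothing to that path.

```python
import uuid


def mint_resource_uri() -> str:
    """Return a fresh urn:uuid: URI (RFC 4122), unrelated to any file path."""
    # uuid4() generates a random UUID; .urn renders it as "urn:uuid:<hex form>".
    return uuid.uuid4().urn


# Hypothetical records: two users hold different files at the same local path.
joe_lcp1 = {"path": "/home/joe/lcp1.rdf", "uri": mint_resource_uri()}
joe2_lcp1 = {"path": "/home/joe/lcp1.rdf", "uri": mint_resource_uri()}

# The paths collide, but the minted identifiers do not.
print(joe_lcp1["uri"])
print(joe2_lcp1["uri"])
```

Nothing here requires a URI to be minted up front; one can be created only at 
the point where a particular local file needs to be identified.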

A related example is real numbers, including non-rationals.  These are sometimes 
treated as resources, even though it is not possible to mint a (finite) URI for 
every such number.  But it *is* possible to mint a unique URI for any particular 
number that one chooses to identify for some purpose.

> 
> Also, I would suggest to refer to the emerging RDF terminology:
> http://www.w3.org/2011/rdf-wg/wiki/GraphConceptTerminology
> 
> 
> lcp1 seems to be a g-text (a turtle serialization).

I don't think there's a need or purpose to invoke that terminology here.

Just consider, for the sake of discussion, a slight revision of the example:

government (gov) converts data (d1) to XML (f1) at time (t1)
government (gov) generates provenance information (prov) regarding XML (f1)
government (gov) publishes XML data (f1) along with its provenance (prov) on a 
portal with a license (li1); the XML data is now available as a Web resource (r1)
  :

I think the example makes just as much sense with RDF replaced by XML, but the 
RDF terminology does not apply to XML data.  And, by the way, I think this 
revised example also represents a use-case that we MUST be able to support 
(except that instead of talking about Turtle and RDF/XML serializations, we might 
talk about text/XML vs EXI (http://www.w3.org/TR/2011/REC-exi-20110310/) 
serializations).

>>
>>> Can we classify all the "data entities" into groups with the same properties?
>>> What are these classes?
>>
>> Do you mean formally classify in an ontological sense, or informally 
>> as in indicating the intent and differences between the labelled 
>> entities?  The latter might be illuminating, but I sense the 
>> possibility of a tar pit here if we push too hard.
> 
> I meant informally at this stage.

Good.

#g
--


>> The AWWW discusses a subclass of resources called "information 
>> resources", but this distinction isn't universally appreciated.  It 
>> has been used as part of the long-running "http-range-14" discussion 
>> [2].  I think this distinction, in some form, might be relevant to the 
>> example.
>>
>> [2] http://www.w3.org/2001/tag/issues.html#httpRange-14
>>
>> #g
>> -- 
>>
>>
> Luc
>
Received on Tuesday, 24 May 2011 11:17:14 UTC