Re: Provenance and properties of entities in RDF

From: Graham Klyne <Graham.Klyne@zoo.ox.ac.uk>
Date: Fri, 21 Oct 2011 09:49:52 +0100
Message-ID: <4EA13230.7070104@zoo.ox.ac.uk>
To: Daniel Garijo <dgarijo@delicias.dia.fi.upm.es>
CC: W3C provenance WG <public-prov-wg@w3.org>

Daniel,

This might be very helpful, but I confess I don't have a clear enough view of 
the entire problem space to offer a clear response on this, right now.

#g
--

On 20/10/2011 15:13, Daniel Garijo wrote:
> Hi Graham,
> Paul created a section on the Data model for "common relations", which
> represent shortcuts
> between model concepts to avoid excessive complexity:
> http://www.w3.org/TR/2011/WD-prov-dm-20111018/#common-relations.
> These shortcuts are not yet available in the ontology, but we plan to
> include them as well.
>
> Would it make sense to add creation as a "type" of attribution? (or in this
> case, subproperty) I think it fits well for the purpose.
> We could even reuse dc:creator and explain how to do the mapping with the
> prov-o concepts in case the implementer
> wants to add more data about the creation process: role played by the agent,
> time, location, etc.
>
> Best,
> Daniel
>
> 2011/10/20 Graham Klyne <graham.klyne@zoo.ox.ac.uk>
>
>> I was aiming to draft something about entity attributes to flesh out the
>> relationship between resources and provenance in PAQ, but ran into some
>> devil-in-detail problems, which I'm trying to explore here...
>>
>> This message arises in part from an off-line discussion with Stian, whom I
>> thank for pointing me to important information about how the provenance
>> model plays out in RDF, and for explaining some of the implications of this.
>> However, any errors and misapprehensions in what follows are all mine.
>>
>>
>> == Provenance and properties of entities in RDF ==
>>
>> I've been looking at how provenance expressions may be represented in RDF,
>> and how such representation interacts with attributes of an entity.
>>
>> For the purpose of this discussion, I'll use a statement using
>> dcterms:creator as an example:
>>
>>   ex:aDocument a prov:Entity ;
>>     dcterms:creator "Meritorious Meerkat" .
>>
>> The RDF statement with property dcterms:creator can be interpreted as an
>> attribute of the entity, *and* as an expression of provenance about the
>> entity.
>>
>> To express the above as provenance using the provenance vocabulary as
>> currently defined, we need to introduce a new class, a subclass of
>> prov:ProcessExecution; e.g.
>>
>>   ex:DocumentCreation rdfs:subClassOf prov:ProcessExecution .
>>   ex:aDocument a prov:Entity ;
>>     prov:wasGeneratedBy
>>       [ a ex:DocumentCreation ;
>>         prov:wasControlledBy
>>           [ a prov:Agent ;
>>             foaf:name "Meritorious Meerkat"
>>           ]
>>       ] .
>>
>> I observe:
>> (a) this structure is quite similar to the sort of event-mediated
>> structures that occur when using CIDOC-CRM [1].
>> (b) the structure is quite complex compared with the original example.
>>
>> [1] http://www.cidoc-crm.org/docs/fin-paper.pdf
>>
>> I'm not saying these are problems, but I am trying to explore the landscape
>> from an implementer's perspective.
>>
>> I think it is probably reasonable for applications with a special interest
>> in generating and/or consuming provenance information - workflow enactment
>> systems come to mind - to generate and work with the more
>> complex format (though my experience with using CIDOC-CRM in RDF suggests
>> that some additional steps may be needed if processing of this data is to
>> scale - but I don't see that as a primary concern at this juncture).
>>
>> My main concerns are that we also want to be able to capture and use
>> provenance information that is generated incidentally by applications that
>> don't have a primary interest in provenance, and the provenance information
>> should similarly be accessible to applications that don't care for the
>> intricacies of provenance information.  Such applications would easily
>> generate and consume statements like the original using dcterms:creator, but
>> may be less able to deal with the more complex provenance vocabulary
>> structures.
>>
>> In my mind, this raises the following questions:
>>
>> (1) is the full complexity of the current provenance model structure
>> actually needed?  I think it probably is, but I feel it's worth reflecting
>> and asking the question.
>>
>> (2) should we look to technical mechanisms to define the relationship
>> between the simple provenance-as-attributes and fully-modeled provenance
>> statements? (E.g., relating the two examples given above.)
>>
>> (3) rather than defining an all-new vocabulary, should we consider basing
>> the mapping of the abstract model to RDF on a subset of the CIDOC-CRM model
>> structures?  (I don't think this would affect PROV-DM, but could affect many
>> of the terms used in PROV-O, and cause some of the mapped structures in RDF
>> to change.)
>>
>> At the very least, and I think this echoes Ivan Herman's recent email to the
>> group [2], I think we need to find a way to make it clear how the simple
>> attributes can be related to the defined provenance model, and maybe provide
>> some guidelines to help provenance-aware applications to interpret and/or
>> generate simple attributes that happen to express provenance information.
>>
>> [2] http://lists.w3.org/Archives/Public/public-prov-wg/2011Oct/0140.html
>>
>> #g
>>
>>
>
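
Question (2) in the quoted message asks whether a technical mechanism could relate the simple dcterms:creator statement to the fully-modeled structure. As an illustration only, here is a minimal Python sketch of one such mapping in both directions. It is not part of PROV-O or any agreed design. Triples are modelled as plain string tuples, blank nodes as "_:bN" labels, and the function names expand_creator and collapse_creator are hypothetical. It also assumes one controlling agent per creation activity and does not distinguish literals from IRIs.

```python
from itertools import count

# Assumption: triples are (subject, predicate, object) tuples of strings.
# Prefixed names stand in for full IRIs; "_:bN" labels stand in for blank nodes.
DCTERMS_CREATOR = "dcterms:creator"


def expand_creator(triples):
    """Rewrite each dcterms:creator triple into the event-mediated form
    used in the quoted example (entity -> creation activity -> agent)."""
    ids = count(1)
    out = []
    for s, p, o in triples:
        if p != DCTERMS_CREATOR:
            out.append((s, p, o))  # unrelated triples pass through unchanged
            continue
        activity = f"_:b{next(ids)}"
        agent = f"_:b{next(ids)}"
        out += [
            (s, "prov:wasGeneratedBy", activity),
            (activity, "rdf:type", "ex:DocumentCreation"),
            (activity, "prov:wasControlledBy", agent),
            (agent, "rdf:type", "prov:Agent"),
            (agent, "foaf:name", o),
        ]
    return out


def collapse_creator(triples):
    """Derive simple dcterms:creator triples from the event-mediated form."""
    types = {(s, o) for s, p, o in triples if p == "rdf:type"}
    # Simplification: at most one controlling agent per activity is kept.
    controlled = {s: o for s, p, o in triples if p == "prov:wasControlledBy"}
    names = {s: o for s, p, o in triples if p == "foaf:name"}
    result = []
    for entity, p, activity in triples:
        if p != "prov:wasGeneratedBy":
            continue
        if (activity, "ex:DocumentCreation") not in types:
            continue  # only creation activities map to dcterms:creator
        agent = controlled.get(activity)
        if agent in names:
            result.append((entity, DCTERMS_CREATOR, names[agent]))
    return result


simple = [
    ("ex:aDocument", "rdf:type", "prov:Entity"),
    ("ex:aDocument", DCTERMS_CREATOR, "Meritorious Meerkat"),
]
full = expand_creator(simple)
recovered = collapse_creator(full)
```

A provenance-aware application could apply something like collapse_creator to expose the simple attribute to consumers that do not understand the full model. The expansion direction necessarily invents an activity node whose details (time, role, location) are unknown.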
Received on Friday, 21 October 2011 09:39:22 GMT