Re: Provenance and properties of entities in RDF

Hi Graham,
Paul created a section in the data model on "common relations", which
represent shortcuts
between model concepts to avoid excessive complexity:
http://www.w3.org/TR/2011/WD-prov-dm-20111018/#common-relations.
These shortcuts are not yet available in the ontology, but we plan to
include them as well.

Would it make sense to add creation as a "type" of attribution (or, in this
case, a subproperty)? I think it fits the purpose well.
We could even reuse dc:creator and explain how to do the mapping with the
prov-o concepts in case the implementer
wants to add more data about the creation process: role played by the agent,
time, location, etc.
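
To make the mapping idea concrete, here is a rough sketch (Python, using plain
tuples rather than any RDF library) of how a simple creator statement could be
expanded into the structure from Graham's example quoted below. The function
name expand_creator is hypothetical, the blank node labels are made up, and it
assumes the creator value is a plain literal name; it only illustrates the
idea and is not a proposed definitive mapping.

```python
# Sketch only: expand a simple dcterms:creator statement into the
# event-mediated provenance structure shown in the quoted message.
# Triples are plain (subject, predicate, object) tuples of strings.
# ex:DocumentCreation is the example class from the quoted message;
# expand_creator and the "_:" labels are hypothetical illustration names.
from itertools import count


def expand_creator(triples):
    """Return the provenance triples implied by each dcterms:creator triple."""
    ids = count(1)  # counter used to mint fresh blank node labels
    out = []
    for s, p, o in triples:
        if p != "dcterms:creator":
            continue  # other attributes are not touched by this sketch
        execution = f"_:creation{next(ids)}"
        agent = f"_:agent{next(ids)}"
        out += [
            (s, "rdf:type", "prov:Entity"),
            (s, "prov:wasGeneratedBy", execution),
            (execution, "rdf:type", "ex:DocumentCreation"),
            (execution, "prov:wasControlledBy", agent),
            (agent, "rdf:type", "prov:Agent"),
            (agent, "foaf:name", o),  # assumes the creator is a literal name
        ]
    return out


simple = [("ex:aDocument", "dcterms:creator", "Meritorious Meerkat")]
expanded = expand_creator(simple)
for triple in expanded:
    print(*triple, ".")
```

An implementer who wants to record the role, time or location would attach
them to the process execution node produced here.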

Best,
Daniel

2011/10/20 Graham Klyne <graham.klyne@zoo.ox.ac.uk>

> I was aiming to draft something about entity attributes to flesh out the
> relationship between resources and provenance in PAQ, but ran into some
> devil-in-detail problems, which I'm trying to explore here...
>
> This message arises in part from an off-line discussion with Stian, whom I
> thank for pointing me to important information about how the provenance
> model plays out in RDF, and for explaining some of the implications of this.
>  However, any errors and misapprehensions in what follows are all mine.
>
>
> == Provenance and properties of entities in RDF ==
>
> I've been looking at how provenance expressions may be represented in RDF,
> and how such representation interacts with attributes of an entity.
>
> For the purpose of this discussion, I'll use a statement using
> dcterms:creator as an example:
>
>  ex:aDocument a prov:Entity ;
>    dcterms:creator "Meritorious Meerkat" .
>
> The RDF statement with property dcterms:creator can be interpreted as an
> attribute of the entity, *and* as an expression of provenance about the
> entity.
>
> To express the above as provenance using the provenance vocabulary as
> currently defined, we need to introduce a new class, a subclass of
> prov:ProcessExecution; e.g.
>
>  ex:DocumentCreation rdfs:subClassOf prov:ProcessExecution .
>  ex:aDocument a prov:Entity ;
>    prov:wasGeneratedBy
>      [ a ex:DocumentCreation ;
>        prov:wasControlledBy
>          [ a prov:Agent ;
>            foaf:name "Meritorious Meerkat"
>          ]
>      ] .
>
> I observe:
> (a) this structure is quite similar to the sort of event-mediated
> structures that occur when using CIDOC-CRM [1].
> (b) the structure is quite complex compared with the original example.
>
> [1] http://www.cidoc-crm.org/docs/fin-paper.pdf
>
> I'm not saying these are problems, but I am trying to explore the landscape
> from an implementer's perspective.
>
> I think it is reasonable for applications with a special interest
> in generating and/or consuming provenance information - workflow enactment
> systems come to mind - to generate and work with the more
> complex format (though my experience with using CIDOC-CRM in RDF suggests
> that some additional steps may be needed if processing of this data is to
> scale - but I don't see that as a primary concern at this juncture).
>
> My main concerns are that we also want to be able to capture and use
> provenance information that is generated incidentally by applications that
> don't have a primary interest in provenance, and the provenance information
> should similarly be accessible to applications that don't care for the
> intricacies of provenance information.  Such applications would easily
> generate and consume statements like the original using dcterms:creator, but
> may be less able to deal with the more complex provenance vocabulary
> structures.
>
> In my mind, this raises the following questions:
>
> (1) is the full complexity of the current provenance model structure
> actually needed?  I think it probably is, but I feel it's worth reflecting
> and asking the question.
>
> (2) should we look to technical mechanisms to define the relationship
> between the simple provenance-as-attributes and fully-modeled provenance
> statements? (E.g., relating the two examples given above.)
>
> (3) rather than defining an all-new vocabulary, should we consider basing
> the mapping of the abstract model to RDF on a subset of the CIDOC-CRM model
> structures?  (I don't think this would affect PROV-DM, but could affect many
> of the terms used in PROV-O, and cause some of the mapped structures in RDF
> to change.)
>
> At the very least, and I think this echoes Ivan Herman's recent email to the
> group [2], I think we need to find a way to make it clear how the simple
> attributes can be related to the defined provenance model, and maybe provide
> some guidelines to help provenance-aware applications to interpret and/or
> generate simple attributes that happen to express provenance information.
>
> [2] http://lists.w3.org/Archives/Public/public-prov-wg/2011Oct/0140.html
>
> #g
>
>

Received on Thursday, 20 October 2011 14:13:40 UTC