Re: Questions about provenance BP 6

Hi Bernadette,

See my inline answers below (short ones, so hopefully this will remain readable).


On 3/1/16 6:50 PM, Bernadette Farias Lóscio wrote:  
>     I'm looking at BP6 "Provide data provenance information" at [1]
>
>
>     1. Why are we using prov:wasAttributedTo to say that "The metadata specifies that John created the Bus Timetable dataset."
>     dct:creator would seem a more natural match.
>     This is what is used in Void [2] and many other catalogues.
>     prov:wasAttributedTo is not really wrong, but it feels very strange in this case, where the creation is very clear. In fact prov:wasAttributedTo has very general semantics: "Attribution is the ascribing of an entity to an agent." [3] So by using it instead of dct:creator, one blurs the information a lot.
>
>
> I agree with you that dct:creator is a more natural match. The idea with this example was to show how to use PROV to describe provenance, but the current example is very simple (see next comment) and doesn't justify the use of PROV.


Well, in fact there is a statement with prov:actedOnBehalfOf in the second part of your example, which I think can be kept.
So you can replace 'prov:wasAttributedTo :john;' by 'dct:creator :john', then keep 'prov:actedOnBehalfOf :transport-agency-mycity;', and put both in bold.
Then you have an example that's both simple *and* uses PROV.
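
For illustration, the two statements could look roughly like the sketch below. This is only a sketch under assumptions: the :bus-timetable identifier, the example.org namespace, and the choice of :john as the subject of prov:actedOnBehalfOf (PROV-O defines that property between agents) are placeholders, not copied from the BP document.

```turtle
@prefix dct:  <http://purl.org/dc/terms/> .
@prefix prov: <http://www.w3.org/ns/prov#> .
@prefix :     <http://example.org/> .          # placeholder namespace (assumption)

# The dataset is created by John (plain Dublin Core, no PROV needed here).
:bus-timetable                                 # hypothetical dataset identifier
    dct:creator :john .

# John acted on behalf of the agency (this is the part that uses PROV).
:john
    prov:actedOnBehalfOf :transport-agency-mycity .
```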


>
>
>     2. The prov:wasAttributedTo statement is the only statement that is in bold in the example. This hints that it's the only provenance info in that metadata. But aren't other statements about provenance too? Especially the ones with dct:issued, dct:modified, dct:publisher.
>
>
> Yes, the example can be improved. This was just an initial idea. There is an issue about this [4].


You've forgotten to put the link below.


>
>     The BP needs to be careful: I think the sentence
>     [
>     The machine readable version of the data provenance may be provided according to the ontology recommended by W3C to describe provenance information, i.e., the Provenance Ontology
>     ]
>     is too strong. Prov is a great contribution to formalize provenance and create fine-grained statements about it. But something doesn't need to be expressed with the Prov ontology to be classified as provenance, even in the W3C context.
>
>
> I agree that PROV is not the only way to express provenance. In fact, we didn't want to say that PROV should or must be used. In the example section, we mention that PROV may be used and this is part of the example. If you think that this can lead to wrong interpretations, then we can rephrase or change the example.
>

I think it's just about re-phrasing the sentence. Here's a suggestion:
[
The machine readable version of the data provenance can be provided using an ontology recommended to describe provenance information, such as W3C's Provenance Ontology
]


>
>
>     Trying to be a bit more concrete: the 'why' part of the BP refers to the fact that users will want to know "the origin or history of the published data". I think dct:issued, dct:modified, dct:publisher clearly match this need.
>
>
> I agree! We're gonna update the example to include these properties.


Great!

Cheers,

Antoine

>
>   Cheers,
> Bernadette
>
>
>     Antoine
>
>     [1] http://w3c.github.io/dwbp/bp.html#provenance
>     [2] https://www.w3.org/TR/void/#dublin-core
>     [3] https://www.w3.org/TR/prov-o/#wasAttributedTo
>
>
> [4]
>
> --
> Bernadette Farias Lóscio
> Centro de Informática
> Universidade Federal de Pernambuco - UFPE, Brazil
> ----------------------------------------------------------------------------

Received on Thursday, 3 March 2016 00:14:48 UTC