Re: PROV-ISSUE-403 (Feedback_TL): Feedback on the mapping from Tim Lebo [Mapping PROV-O to Dublin Core]

Hi Tim,
I have answered your review here:
The changes that need to be made to the document are listed here:
in the changelog section. I will add them soon.


2012/6/9 Provenance Working Group Issue Tracker

> PROV-ISSUE-403 (Feedback_TL): Feedback on the mapping from Tim Lebo
> [Mapping PROV-O to Dublin Core]
> Raised by: Daniel Garijo
> On product: Mapping PROV-O to Dublin Core
> Regarding
> 1)
> "To be more precise, we define provenance metadata as metadata providing
> provenance information according to the definition of the W3C Provenance
> Incubator Group"
> Why are you still using the XG's definition? Does PROV-WG still not
> provide one that you like? Should PROV-WG be explicit about their
> definition of provenance (since its materials will become Recommendation
> and XG's will not)?
> 2)
> "For the complex mappings, we take the following approach: "
> is confusing. Is one of the "three parts" enumerated above "complex"? Ah,
> yes. The third.
> Suggest to draw that connection more clearly.
> 3)
> The points in the second half of the paragraph:
> ". A rationale for these two steps is that the mappings in stage 1 are
> context free and do not depend on the existence of any other statements. On
> the other hand, by employing the patterns developed for stage 2, any kind
> of generated PROV data could be cleaned up at a later point, for instance
> after the integration with provenance information from a different source,
> which could be advantageous. "
> really should be promoted to the first half of the paragraph. It takes too
> long to determine what the distinction is between the two phases.
> 4)
> The use of blank nodes is disturbing (
> Please make it clear
> that the bnodes only exist during the processing that you suggest, and that
> bnodes are not produced in resulting PROV or DC records.
> 5)
> Direct mappings:
>  -1 dct:references rdfs:subPropertyOf prov:wasDerivedFrom .
>  +1 dct:creator rdfs:subPropertyOf prov:wasAttributedTo .
>  +1 dct:rightsHolder rdfs:subPropertyOf prov:wasAttributedTo .
>  -1 (casting a broad to a specific) dct:date rdfs:subPropertyOf
> prov:generatedAtTime .
>  +1 dct:Agent owl:equivalentClass prov:Agent .
>  -1 (reverse these) prov:hadOriginalSource rdfs:subPropertyOf dct:source .
>  +1 prov:wasRevisionOf rdfs:subPropertyOf dct:isVersionOf .
> Voting for all of them (in
>  +1 dct:Agent           owl:equivalentClass   prov:Agent.
>  -1 dct:references      rdfs:subPropertyOf    prov:wasDerivedFrom .
>  +1 dct:rightsHolder    rdfs:subPropertyOf    prov:wasAttributedTo .
>  +1 dct:creator         rdfs:subPropertyOf    prov:wasAttributedTo .
>  +1 dct:publisher       rdfs:subPropertyOf    prov:wasAttributedTo .
>  +1 dct:contributor     rdfs:subPropertyOf    prov:wasAttributedTo .
>  +1 dct:isVersionOf     rdfs:subPropertyOf    prov:wasDerivedFrom .
>  +1 dct:isFormatOf      rdfs:subPropertyOf    prov:alternateOf .
>  +1 dct:replaces        rdfs:subPropertyOf    prov:tracedTo .
>  +1 dct:source          rdfs:subPropertyOf    prov:wasDerivedFrom .
>  -1 dct:date            rdfs:subPropertyOf    prov:generatedAtTime .
> I would support reversing the above. As it is, you are casting a general
> "any date you wish" into a very specific meaning.
> At first glance, the following are concerning. If the same instance has
> all of these properties, then it was generated at many distinct times.
> Perhaps your complex mappings tease this out.
>  -1 dct:issued          rdfs:subPropertyOf    prov:generatedAtTime .
>  -1 dct:dateAccepted    rdfs:subPropertyOf    prov:generatedAtTime .
>  -1 dct:dateCopyRighted rdfs:subPropertyOf    prov:generatedAtTime .
>  -1 dct:dateSubmitted   rdfs:subPropertyOf    prov:generatedAtTime .
>  -1 dct:modified        rdfs:subPropertyOf    prov:generatedAtTime .
> The following casts a range into an instant of time.
>  -1 dct:valid           rdfs:subPropertyOf    prov:generatedAtTime .
>  -1 prov:hadOriginalSource rdfs:subPropertyOf dct:source .
> I would support reversing the above. PROV is pointing to a subset of the
> sources that dct:source intends to cite. dct:source is the union of
> hadOriginalSource and any of its derivations (and more, perhaps).
>  +1 prov:wasRevisionOf     rdfs:subPropertyOf dct:isVersionOf .
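
To illustrate the concern raised in point 5 about the date properties, here is a minimal sketch in plain Python (no RDF library). The resource :doc and the three dates are made up for the example, and only the RDFS subPropertyOf rule is modelled. It shows that, if several dct date properties are declared subproperties of prov:generatedAtTime, a single entity ends up with several generation times after inference.

```python
# Assumption: triples are plain tuples of prefixed names; this is not a real RDF store.
# The subPropertyOf axioms below are three of the direct mappings voted -1 in point 5.
SUBPROPERTY_OF = {
    "dct:issued": "prov:generatedAtTime",
    "dct:dateSubmitted": "prov:generatedAtTime",
    "dct:modified": "prov:generatedAtTime",
}


def apply_subproperty_rule(triples, subproperty_of):
    """RDFS rule rdfs7: (s p o) plus (p rdfs:subPropertyOf q) entails (s q o)."""
    entailed = set(triples)
    for subject, prop, obj in triples:
        if prop in subproperty_of:
            entailed.add((subject, subproperty_of[prop], obj))
    return entailed


# Hypothetical Dublin Core record for one document.
doc_triples = {
    (":doc", "dct:dateSubmitted", "2012-01-10"),
    (":doc", "dct:issued", "2012-03-01"),
    (":doc", "dct:modified", "2012-06-09"),
}

entailed = apply_subproperty_rule(doc_triples, SUBPROPERTY_OF)
generation_times = sorted(
    obj
    for subject, prop, obj in entailed
    if subject == ":doc" and prop == "prov:generatedAtTime"
)
print(generation_times)  # ['2012-01-10', '2012-03-01', '2012-06-09']
```

The same entity is inferred to have been generated at three distinct times, which is the situation the review asks the complex mappings to tease apart.
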
> 6)
> In
> For readability, I'd reverse the order of these:
>  dcprov:CreationActivity rdfs:subClassOf
>    prov:Activity, dcprov:ContributionActivity .
>  dcprov:ContributionActivity rdfs:subClassOf
>    prov:Activity .
> 7)
> In
> For readability, I'd reverse the order of these:
>  dcprov:CreatorRole rdfs:subClassOf
>    prov:Role, dcprov:ContributorRole .
>  dcprov:ContributorRole rdfs:subClassOf
>    prov:Role .
> 8)
> If we reapply the SPARQL queries from the complex mappings twice, do we
> get two un-identified blank nodes that should be identified?
> If so, this leads to proliferation of bnodes that should be avoided. If
> the queries are only to be informative, and those bnodes to be
> appropriately named to avoid duplication, then I suggest this be clearly
> stated.
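
The question in point 8 can be sketched in plain Python as follows. This is a simplified stand-in for the dct:creator mapping, not the actual query: the functions, the ex:creation- naming scheme and the sample triple are assumptions made for illustration. It relies on the fact that a blank node in a SPARQL CONSTRUCT template is instantiated afresh for every solution and every run.

```python
import hashlib
import itertools

_bnode_ids = itertools.count()


def construct_with_bnodes(graph):
    """Mimic a CONSTRUCT template containing _:act: each run mints a fresh node."""
    out = set()
    for doc, prop, agent in sorted(graph):
        if prop == "dct:creator":
            act = f"_:b{next(_bnode_ids)}"
            out |= {
                (act, "rdf:type", "prov:Activity"),
                (act, "prov:wasAssociatedWith", agent),
                (doc, "prov:wasGeneratedBy", act),
            }
    return out


def construct_with_named_nodes(graph):
    """Same mapping, but the activity is named deterministically from the bindings."""
    out = set()
    for doc, prop, agent in sorted(graph):
        if prop == "dct:creator":
            digest = hashlib.sha1(f"{doc}|{agent}".encode("utf-8")).hexdigest()[:8]
            act = f"ex:creation-{digest}"  # hypothetical naming scheme
            out |= {
                (act, "rdf:type", "prov:Activity"),
                (act, "prov:wasAssociatedWith", agent),
                (doc, "prov:wasGeneratedBy", act),
            }
    return out


def count_activities(graph):
    return sum(1 for _, prop, obj in graph if prop == "rdf:type" and obj == "prov:Activity")


# Hypothetical input: one document with one creator.
source = {(":doc", "dct:creator", ":alice")}

twice_bnodes = construct_with_bnodes(source) | construct_with_bnodes(source)
twice_named = construct_with_named_nodes(source) | construct_with_named_nodes(source)

print(count_activities(twice_bnodes))  # 2
print(count_activities(twice_named))   # 1
```

Applying the blank-node version twice yields two activities for the same creation, while the deterministically named version is idempotent.
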
> 9)
> In "List of dc terms excluded from the mapping",
> I suggest to organize by descriptive vs. provenance metadata. That way I
> can review your categorization more easily, AND focus on only the
> provenance metadata (which is the point of the mapping).
> 10)
> In
> No bibliography for (DCMI Usage Board, 2010b) or (DCMI Usage Board, 2010a)
> You don't reference the URL ?
> 11)
> It seems like you could include the content of
> and
> in the "primer" - the redundancy is dissonant.
> Why three complex mappings in the primer? Why not fewer?
> The organization across 4 pages makes it difficult to determine "what is
> where". I think the content as it is could stand on its own as one document.
> 12)
> Where is stage 2 of the complex mappings?
> 13) Are there implementations of your complex mapping?
> 14)
> The following order makes more sense to me
>  dcprov:PublicationActivity      rdfs:subClassOf     prov:Activity .
>  dcprov:ContributionActivity     rdfs:subClassOf     prov:Activity .
>  dcprov:CreationActivity         rdfs:subClassOf     prov:Activity,
> dcprov:ContributionActivity .
>  dcprov:ContributorRole          rdfs:subClassOf     prov:Role .
>  dcprov:PublisherRole            rdfs:subClassOf     prov:Role .
>  dcprov:CreatorRole              rdfs:subClassOf     prov:Role,
> dcprov:ContributorRole .
> 15)
> Are the following used in the complex rules? It would be very nice to show
> which rules each specialization is used in. Similarly, it would be nice to
> group rules by their use of PROV terms, and by "in the where" versus "in
> the construct". A navigation like this would really bring the material
> together nicely.
>  dcprov:PublicationActivity      rdfs:subClassOf     prov:Activity .
>  dcprov:ContributionActivity     rdfs:subClassOf     prov:Activity .
>  dcprov:CreationActivity         rdfs:subClassOf     prov:Activity,
> dcprov:ContributionActivity .
>  dcprov:ContributorRole          rdfs:subClassOf     prov:Role .
>  dcprov:PublisherRole            rdfs:subClassOf     prov:Role .
>  dcprov:CreatorRole              rdfs:subClassOf     prov:Role,
> dcprov:ContributorRole .
> 16)
> Is the following a copy paste error (publisher is never mentioned):
> Section: dct:publisher
>  CONSTRUCT {
>    ?doc a prov:Entity .
>       prov:wasAttributedTo ?ag .
>    _:out a prov:Entity .
>       prov:specializationOf ?doc .
>    ?ag a prov:Agent .
>    _:act a prov:Activity, dcprov:PublicationActivity ;
>       prov:wasAssociatedWith ?ag ;
>       prov:qualifiedAssociation _:assoc .
>    _:assoc a prov:Association ;
>       prov:agent ?ag ;
>       prov:hadRole dcprov:PublisherRole .
>    _:out prov:wasGeneratedBy _:act ;
>       prov:wasAttributedTo ?ag .
>  } WHERE {
>    ?doc dct:creator ?ag .
>  }
> 17)
> spacing is off in:
>  dct:rightsHolder
>  The rightsHolder is different, here we propose to omit the activity and
> just add the rights holder to the entity by means of
>  prov:wasAttributedTo. This mapping could actually be omitted as the
> statements can be inferred from the direct mapping.
>  CONSTRUCT {
>  ?doc     a                         prov:Entity .
>  ?ag       a                         prov:Agent .
>  ?doc     prov:wasAttributedTo      ?ag .
>  } WHERE {
>  ?doc dct:rightsHolder?ag .
>  }
> 18)
> Recommend expanding variable names to be more readable (e.g., ?ag to
> ?agent)
> 19)
> Is there a reason why you use "_:iss_entity" instead of just the "[]"
> syntax? smearing a node across the CONSTRUCT makes it more difficult to
> read. You used the "[]" in :
> dct:modified
>  [ a prov:Generation ;
>                                                 prov:atTime ?date  ;
>                                                 prov:activity _:act . ] .

Received on Wednesday, 4 July 2012 16:09:33 UTC