Re: PROV-ISSUE-403 (Feedback_TL): Feedback on the mapping from Tim Lebo [Mapping PROV-O to Dublin Core]

Hi Tim,
I have answered your review here:
https://github.com/dcmi/DC-PROV-Mapping/wiki/Tim-lebo
The changes that need to be made to the document are listed here:
https://github.com/dcmi/DC-PROV-Mapping/wiki/Dealing-with-feedback
in the changelog section. I will add them soon.

Best,
Daniel

2012/6/9 Provenance Working Group Issue Tracker <sysbot+tracker@w3.org>

> PROV-ISSUE-403 (Feedback_TL): Feedback on the mapping from Tim Lebo
> [Mapping PROV-O to Dublin Core]
>
> http://www.w3.org/2011/prov/track/issues/403
>
> Raised by: Daniel Garijo
> On product: Mapping PROV-O to Dublin Core
>
> Regarding
> https://github.com/dcmi/DC-PROV-Mapping/wiki/Mapping-Primer#wiki-References
>
> 1)
> "To be more precise, we define provenance metadata as metadata providing
> provenance information according to the definition of the W3C Provenance
> Incubator Group"
>
> Why are you still using the XG's definition? Does PROV-WG still not
> provide one that you like? Should PROV-WG be explicit about their
> definition of provenance (since its materials will become Recommendation
> and XG's will not)?
>
>
> 2)
>
> "For the complex mappings, we take the following approach: "
>
> is confusing. Is one of the "three parts" enumerated above "complex"? Ah,
> yes. The third.
>
> I suggest drawing that connection more clearly.
>
> 3)
>
> The points in the second half of the paragraph:
>
> ". A rationale for these two steps is that the mappings in stage 1 are
> context free and do not depend on the existence of any other statements. On
> the other hand, by employing the patterns developed for stage 2, any kind
> of generated PROV data could be cleaned up at a later point, for instance
> after the integration with provenance information from a different source,
> which could be advantageous. "
>
> really should be promoted to the first half of the paragraph. It takes too
> long to determine what the distinction is between the two phases.
>
> 4)
>
> The use of blank nodes is disturbing (
> http://linkeddatabook.com/editions/1.0/#htoc16). Please make it clear
> that the bnodes only exist during the processing that you suggest, and that
> bnodes are not produced in resulting PROV or DC records.
>
> 5)
>
> Direct mappings:
>
>  -1 dct:references rdfs:subPropertyOf prov:wasDerivedFrom .
>  +1 dct:creator rdfs:subPropertyOf prov:wasAttributedTo .
>  +1 dct:rightsHolder rdfs:subPropertyOf prov:wasAttributedTo .
>  -1 (casting a broad to a specific) dct:date rdfs:subPropertyOf
> prov:generatedAtTime .
>  +1 dct:Agent owl:equivalentClass prov:Agent .
>  -1 (reverse these) prov:hadOriginalSource rdfs:subPropertyOf dct:source .
>  +1 prov:wasRevisionOf rdfs:subPropertyOf dct:isVersionOf .
>
> Voting for all of them (in
> https://github.com/dcmi/DC-PROV-Mapping/wiki/Direct-Mappings):
>
>  +1 dct:Agent           owl:equivalentClass   prov:Agent.
>  -1 dct:references      rdfs:subPropertyOf    prov:wasDerivedFrom .
>
>  +1 dct:rightsHolder    rdfs:subPropertyOf    prov:wasAttributedTo .
>  +1 dct:creator         rdfs:subPropertyOf    prov:wasAttributedTo .
>  +1 dct:publisher       rdfs:subPropertyOf    prov:wasAttributedTo .
>  +1 dct:contributor     rdfs:subPropertyOf    prov:wasAttributedTo .
>
>  +1 dct:isVersionOf     rdfs:subPropertyOf    prov:wasDerivedFrom .
>  +1 dct:isFormatOf      rdfs:subPropertyOf    prov:alternateOf .
>  +1 dct:replaces        rdfs:subPropertyOf    prov:tracedTo .
>  +1 dct:source          rdfs:subPropertyOf    prov:wasDerivedFrom .
>
>  -1 dct:date            rdfs:subPropertyOf    prov:generatedAtTime .
>
> I would support reversing the above. As it is, you are casting a general
> "any date you wish" into a very specific meaning.
>
> At first glance, the following are concerning. If the same instance has
> all of these properties, then it was generated at many distinct times.
> Perhaps your complex mappings tease this out.
>
>  -1 dct:issued          rdfs:subPropertyOf    prov:generatedAtTime .
>  -1 dct:dateAccepted    rdfs:subPropertyOf    prov:generatedAtTime .
>  -1 dct:dateCopyRighted rdfs:subPropertyOf    prov:generatedAtTime .
>  -1 dct:dateSubmitted   rdfs:subPropertyOf    prov:generatedAtTime .
>  -1 dct:modified        rdfs:subPropertyOf    prov:generatedAtTime .
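
A minimal sketch of the concern raised above, assuming plain RDFS subPropertyOf entailment (rule rdfs7) is applied to the proposed direct mappings. The resource ex:doc and the three dates are made-up sample data, and the `infer` helper is hypothetical, not part of the mapping:

```python
# Hypothetical subset of the direct mappings under discussion, as sub -> super.
SUB_PROPERTY_OF = {
    "dct:issued": "prov:generatedAtTime",
    "dct:dateSubmitted": "prov:generatedAtTime",
    "dct:modified": "prov:generatedAtTime",
}

def infer(triples):
    """Apply rdfs7 once: (s p o) and (p subPropertyOf q) entail (s q o)."""
    inferred = set(triples)
    for s, p, o in triples:
        if p in SUB_PROPERTY_OF:
            inferred.add((s, SUB_PROPERTY_OF[p], o))
    return inferred

# Made-up record: one resource carrying several Dublin Core dates.
record = {
    ("ex:doc", "dct:dateSubmitted", "2012-01-10"),
    ("ex:doc", "dct:issued", "2012-03-01"),
    ("ex:doc", "dct:modified", "2012-05-20"),
}

generation_times = sorted(
    o for s, p, o in infer(record) if p == "prov:generatedAtTime"
)
print(generation_times)  # ['2012-01-10', '2012-03-01', '2012-05-20']
```

Under these assumptions the single resource ends up with three distinct prov:generatedAtTime values, which is the situation the -1 votes object to.
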
>
> The following casts a range into an instant of time.
>
>  -1 dct:valid           rdfs:subPropertyOf    prov:generatedAtTime .
>
>  -1 prov:hadOriginalSource rdfs:subPropertyOf dct:source .
>
> I would support reversing the above. PROV is pointing to a subset of the
> sources that dct:source intends to cite. dct:source is the union of
> hadOriginalSource and any of its derivations (and more, perhaps).
>
>  +1 prov:wasRevisionOf     rdfs:subPropertyOf dct:isVersionOf .
>
>
> 6)
>
> In https://github.com/dcmi/DC-PROV-Mapping/wiki/Mapping-Primer
>
> For readability, I'd reverse the order of these:
>
>  dcprov:CreationActivity rdfs:subClassOf
>    prov:Activity, dcprov:ContributionActivity .
>  dcprov:ContributionActivity rdfs:subClassOf
>    prov:Activity .
>
> 7)
>
> In https://github.com/dcmi/DC-PROV-Mapping/wiki/Mapping-Primer
>
> For readability, I'd reverse the order of these:
>
>  dcprov:CreatorRole rdfs:subClassOf
>    prov:Role, dcprov:ContributorRole .
>  dcprov:ContributorRole rdfs:subClassOf
>    prov:Role .
>
> 8)
>
> If we reapply the SPARQL queries from the complex mappings twice, do we
> get two un-identified blank nodes that should be identified?
> If so, this leads to proliferation of bnodes that should be avoided. If
> the queries are only to be informative, and those bnodes to be
> appropriately named to avoid duplication, then I suggest this be clearly
> stated.
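
To illustrate point 8, here is a rough sketch comparing fresh blank nodes with deterministically named nodes when the same mapping is applied twice. It does not run SPARQL; it only mimics the node-minting behaviour of a CONSTRUCT template. The namespace, the helper names, and the hashing scheme are all assumptions for illustration, not something the mapping prescribes:

```python
import hashlib

BASE = "http://example.org/prov/"  # hypothetical namespace, not from the mapping

def mint_bnode(counter):
    """Mimic CONSTRUCT: every application yields a fresh, unrelated blank node."""
    counter[0] += 1
    return f"_:b{counter[0]}"

def mint_iri(doc, agent, kind):
    """Derive a stable IRI from the values bound in the WHERE clause."""
    digest = hashlib.sha1(f"{kind}|{doc}|{agent}".encode("utf-8")).hexdigest()[:12]
    return f"<{BASE}{kind}/{digest}>"

def map_publisher(doc, agent, name_activity):
    """Emit two of the triples from the dct:publisher mapping as tuples."""
    act = name_activity(doc, agent)
    return {
        (act, "rdf:type", "dcprov:PublicationActivity"),
        (act, "prov:wasAssociatedWith", agent),
    }

doc, agent = "<http://example.org/doc1>", "<http://example.org/alice>"

counter = [0]
with_bnodes, with_iris = set(), set()
for _ in range(2):  # apply the same mapping twice
    with_bnodes |= map_publisher(doc, agent, lambda d, a: mint_bnode(counter))
    with_iris |= map_publisher(doc, agent, lambda d, a: mint_iri(d, a, "publication"))

print(len(with_bnodes))  # 4: two unrelated activity nodes
print(len(with_iris))    # 2: the second application adds nothing new
```

With named nodes the mapping becomes idempotent in this sketch, at the cost of committing to a naming scheme that the mapping document would then have to state.
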
>
> 9)
>
> In https://github.com/dcmi/DC-PROV-Mapping/wiki/Complex-Mappings-S1
> section "List of dc terms excluded from the mapping",
> I suggest organizing by descriptive vs. provenance metadata. That way I
> can review your categorization more easily, AND focus on only the
> provenance metadata (which is the point of the mapping).
>
> 10)
>
> In https://github.com/dcmi/DC-PROV-Mapping/wiki/Mapping-Primer
>
> No bibliography for (DCMI Usage Board, 2010b) or (DCMI Usage Board, 2010a)
>
> You don't reference the URL http://dublincore.org/documents/dcmi-terms/ ?
>
> 11)
>
> It seems like you could include the content of
> https://github.com/dcmi/DC-PROV-Mapping/wiki/Direct-Mappings and
> https://github.com/dcmi/DC-PROV-Mapping/wiki/Prov-Specializations directly
> in the "primer" - the redundancy is dissonant.
>
> Why three complex mappings in the primer? Why not fewer?
>
> The organization across 4 pages makes it difficult to determine "what is
> where". I think the content as it is could stand on its own as one document.
>
> 12)
>
> Where is stage 2 of the complex mappings?
>
>
> 13)
>
> Are there implementations of your complex mapping?
>
>
>
> 14)
>
> https://github.com/dcmi/DC-PROV-Mapping/wiki/Prov-Specializations
>
> The following order makes more sense to me
>
>  dcprov:PublicationActivity      rdfs:subClassOf     prov:Activity .
>  dcprov:ContributionActivity     rdfs:subClassOf     prov:Activity .
>  dcprov:CreationActivity         rdfs:subClassOf     prov:Activity,
> dcprov:ContributionActivity .
>  dcprov:ContributorRole          rdfs:subClassOf     prov:Role .
>  dcprov:PublisherRole            rdfs:subClassOf     prov:Role .
>  dcprov:CreatorRole              rdfs:subClassOf     prov:Role,
> dcprov:ContributorRole .
>
>
>
> 15)
>
> https://github.com/dcmi/DC-PROV-Mapping/wiki/Prov-Specializations
>
> Are the following used in the complex rules? It would be very nice to show
> which rules each specialization is used in. Similarly, it would be nice to
> group rules by their use of PROV terms, and by "in the where" versus "in
> the construct". A navigation like this would really bring the material
> together nicely.
>
>  dcprov:PublicationActivity      rdfs:subClassOf     prov:Activity .
>  dcprov:ContributionActivity     rdfs:subClassOf     prov:Activity .
>  dcprov:CreationActivity         rdfs:subClassOf     prov:Activity,
> dcprov:ContributionActivity .
>  dcprov:ContributorRole          rdfs:subClassOf     prov:Role .
>  dcprov:PublisherRole            rdfs:subClassOf     prov:Role .
>  dcprov:CreatorRole              rdfs:subClassOf     prov:Role,
> dcprov:ContributorRole .
>
>
> 16)
>
> Is the following a copy paste error (publisher is never mentioned):
>
> https://github.com/dcmi/DC-PROV-Mapping/wiki/Complex-Mappings-S1
>
> Section: dct:publisher
>
>  CONSTRUCT {
>    ?doc a prov:Entity .
>       prov:wasAttributedTo ?ag .
>    _:out a prov:Entity .
>       prov:specializationOf ?doc .
>    ?ag a prov:Agent .
>    _:act a prov:Activity, dcprov:PublicationActivity ;
>       prov:wasAssociatedWith ?ag ;
>       prov:qualifiedAssociation _:assoc .
>    _:assoc a prov:Association ;
>       prov:agent ?ag ;
>       prov:hadRole dcprov:PublisherRole .
>    _:out prov:wasGeneratedBy _:act ;
>       prov:wasAttributedTo ?ag .
>  } WHERE {
>    ?doc dct:creator ?ag .
>  }
>
>
>
> 17)
>
> https://github.com/dcmi/DC-PROV-Mapping/wiki/Complex-Mappings-S1
>
> spacing is off in:
>
>
>  dct:rightsHolder
>
>  The rightsHolder is different, here we propose to omit the activity and
> just add the rights holder to the entity by means of
>  prov:wasAttributedTo. This mapping could actually be omitted as the
> statements can be inferred from the direct mapping.
>
>  CONSTRUCT {
>  ?doc     a                         prov:Entity .
>  ?ag       a                         prov:Agent .
>  ?doc     prov:wasAttributedTo      ?ag .
>  } WHERE {
>  ?doc dct:rightsHolder?ag .
>  }
>
>
> 18)
>
> https://github.com/dcmi/DC-PROV-Mapping/wiki/Complex-Mappings-S1
>
> Recommend expanding variable names to be more readable (e.g., ?ag to
> ?agent)
>
> 19)
>
> https://github.com/dcmi/DC-PROV-Mapping/wiki/Complex-Mappings-S1
>
> Is there a reason why you use "_:iss_entity" instead of just the "[]"
> syntax? Smearing a node across the CONSTRUCT makes it more difficult to
> read. You used the "[]" in :
>
>
> dct:modified
>
>  [ a prov:Generation ;
>    prov:atTime ?date ;
>    prov:activity _:act . ] .
>
>
>
>

Received on Wednesday, 4 July 2012 16:09:33 UTC