Re: Late review of the PROV-DC mapping document

Thanks for your feedback, Antoine!
I'll try to make the changes before Thursday (when we have the final vote
for the note).
Best,
Daniel


2013/4/14 Antoine Isaac <aisaac@few.vu.nl>

> Dear Daniel, Kai,
>
> I believe I'm too late for comments. Other stuff has prevented me from
> working on it as promised. But as I was curious to read it, I thought I
> would write my comments down and send them around. I'm sorry if this
> creates some trouble in your process.
>
> In fact I did not look very thoroughly into the mappings of section 3.1.
> Time allowing I may send another email later. The mappings look appropriate
> at first sight, though. Most of my comments (listed below) are editorial,
> though some may touch on conceptual issues in the arguments exposed in the
> text.
>
> The only real problems now could be with roles, e.g., prov:Creator,
> prov:Publisher. In section 3.2 they are introduced as classes but in
> section 3.3 they are used as instances of classes. And in section 3.4 it is
> mixed: the Turtle example has an instance but Fig. 3 has prov:Creator as a
> class with an instance which is not mentioned in the Turtle example. What
> is your choice? How does PROV handle roles?
>
>
> A last general note/disclaimer, I have to say that I will not apply the
> mappings soon myself, especially not the complex ones (with a 1:15
> multiplier ratio between the input triples and the output triples in
> section 3.3, the clean-up in section 3.4 is a much welcome suggestion!). To
> some extent I am reading the document now not because I plan to push
> implementation of all its recipes in Europeana or elsewhere, but because
> it is a good introduction to PROV for a more traditional metadata
> community. And this is no small achievement. Well done!
>
> Best
>
> Antoine
>
> =====
>
>
> - Abstract: please spell out the URI of the "here" hyperlinks. Or create a
> specific paragraph in the intro that does it, and point to this paragraph
> from the abstract.
>
> - Status of document: remove "(to be published as X)" or "(Proposed
> recommendation)" from the listing of PROV documents. In fact I'd suggest
> just to make a reference to PROV-OVERVIEW and do a much welcome shortening
> of the section.
>
> - ToC - Structure of the document: the document lacks an "Appendices"
> section to wrap A and B together apart from the other (numbered) sections.
>
> - ToC - Structure of the document: I see no reason why there is a B1. The
> notion of "informative references" is maybe not very useful in a note such
> as yours!
>
> - Use of "Dublin Core", "DC", "dc": a couple of occurrences of "Dublin
> Core" appear after you've started using the abbreviation in a systematic
> way. Homogenization would be good! Also, there's (at least) one "dc" in
> Courier font in 2.2.
>
>
> ========= Section 2.1
>
> - The word "affected" in the first paragraph (E.g., it hints that a
> resource can be "affected in the past") does not mean much to me, as a
> non-native speaker.
>
> - the paragraph on "Descriptive terms" mentions 30 terms for that
> category. Table 2 has 29.
>
> - Perhaps a similar issue as above, for "derivation". The elements for
> rights, which are often related to access and consumption, seem to have a
> broader scope than what I understand to be "derivation". It's as if you are
> trying to shoehorn rights into this category. I'm not convinced, and I
> can't see much benefit in trying this anyway. There could just be an extra
> category. As a matter of fact I would find this in line with the fact that
> all rights-related properties have naturally found their place in Table 6
> of the rejected properties. For some it is so obvious that you have
> (rightly) not written a reason for rejecting them!
>
> - Table 2. My printout did not print the expected "What" in the first line
> (it could be a bug on my side).
>
>
> ========= Section 2.2
>
> - example 1: I'd recommend using more meaningful URIs for the document
> versions, e.g. ex:prov-dc-20130312 and ex:prov-dc-20121211.
>
> - "relates to the different states that the document had". My gut reading
> of this sentence was that it was about versions only, which is too
> restrictive (there's more at stake than logical versions of a doc) and
> unconvincing (if the aim was to capture versions only, dct:replaces would
> be quite enough). Perhaps replace by "relates to the different stages the
> document underwent" or something more grammatically correct than this.
>
> - "involves two different states of the document: the document before it
> was issued and the issued document". To many readers in the DC community,
> there will be just one document before and after issuing; it does not
> really change. Perhaps removing "document" from the second part ("states of
> the document: /before/ and /after/ publication") will help avoid
> discouraging them. This can also make the sentence more coherent (the
> object is "states" in the first part and "the document" in the second).
> It looks like nitpicking, but I fear there's a real risk of losing a part
> of your core audience here.
>
> - Figure 1: if the graph convention used is the one used throughout all
> PROV documents, it may be useful to mention it. It looks very ad hoc,
> otherwise.
>
> - Approach 2: I don't buy the argument that the pattern "implies that
> ex:doc1 was generated by _:activity and then used by _:activity
> afterwards". Is there some specific semantics to activities' properties,
> which I'm missing?
> As I understand it, Approach 1 does not imply that "_:resulting_entity was
> generated by _:activity and then _:used_entity was used by _:activity
> afterwards", which is the exact transposition of your interpretation in
> Approach 2.
>
> - Fig. 2: whether I'm right or wrong on the above issue, you can remove "(as
> it implies[...]activity)" from the caption. It doesn't really belong there.
>
> - I thought (from the previous version of the PROV-DC document) that the
> most important argument against Approach 2 was that PROV discouraged the
> same resource from being used as both the input and the output of an
> activity at the same time. Has it changed? Personally I didn't like that
> PROV rule, but in the context of a DC-PROV mapping this was a very
> powerful argument...
>
>
> ========== Section 3.1
>
> - first paragraph: move "(i.e. they will be able to understand DC
> statements)" just after "to interoperate with these DC statements". The
> bracketed sentence doesn't really explain "reasoning" per se; it rather
> tries to explain interoperability.
> And is "by applying means of OWL 2" really a grammatical construction?
>
> - Table 3: finding dct:Agent here comes a bit as a surprise, as the class
> has not been introduced before (e.g. in Table 2). Perhaps it could be
> presented aside.
>
> - Table 3 is really big and has a lot of white space. Maybe removing the
> namespace prefixes (which do not bear much info anyway, given what the
> columns include) would allow you to trim the first three columns.
>
> - Please keep in Table 3 the order defined in Table 3! The current
> mismatch makes comparison difficult, and for no real reason it seems.
>
> - "This is valid since from the PROV point of view" and the rest of the
> paragraph should be tightened. In the RDF graph that results from example
> 1, there is a prov:Entity with two prov:generatedAtTime statements. Is it
> valid or not? The paragraph currently hints at both (it is valid, but does
> not comply with PROV constraints), which is confusing.
>
> - Table 5 has a confusing introduction: what is its rationale as a
> separate table? The fact that it's mapping to inverse relationships, or the
> fact that it's mapping to outside the core of PROV?
>
>
> ========== Section 3.2
>
> My personal taste would be to remove the somewhat redundant prov:Activity
> and prov:Role from the rdfs:subClassOf statements of prov:Create and
> prov:Creator.
>
> You could replace "refinements of the properties have been omitted" by
> "refinements of the properties are not needed". The latter is stronger, and
> still true!
>
>
> ========== Section 3.3
>
> I don't understand why replacement is presented as the result of a "search
> and replace". There's no "search" implied by default in a dct:replaces
> link, is there?
>
>
> ========== Section 3.4
>
> The notion of "complement" is unclear. Rather than "certain properties
> complement each other" couldn't we have "certain properties indicate the
> same activity"?
> I am also not sure that dct:modified and dct:contributor are so connected.
> A contributor can be involved in the creation of the document, I believe.
>
>
> ========== Section 3.5
>
> Table 6: it is confusing to find here the elements that Table 1 lists as
> relevant for provenance (who when how) and the descriptive metadata
> elements. The table would benefit from removing the descriptive ones,
> especially those for which it is absolutely no surprise that they
> shouldn't be mapped. Or at least the categories should be separated into
> different tables...
> Splitting the tables (in all 4 categories, in fact) would also allow you to
> get rid of the second column, which consumes a lot of space for pretty much
> nothing.
> It would also help comparisons. Table 2 has 29 "descriptive metadata
> elements", Table 6 has 28. With the order being different, and the size so
> big, I won't make the effort to find out which element has been left out.
>
> The dct:isRequiredBy line has a typo ("reosource").
>

Received on Monday, 15 April 2013 08:08:20 UTC