Pre-last-call review of PROV-O

All,

I've just read through PROV-O, and generally I like the overall structure and 
approach.  There are a few (editorial) points you might wish to consider in 
preparing for last call:

(1) would it be possible for term names to be included in the table of contents? 
  I found some aspects of the document difficult to 
navigate/cross-reference in printed form.

(2) is there a defined correspondence between the ontology terms and PROV-DM? 
Maybe a table as an appendix?

(3) In sections 4.1/4.2, I found a few descriptions that look like cut-and-paste 
leftovers:  prov:wasAttributedTo, prov:wasDerivedFrom, prov:alternateOf, 
prov:generatedAtTime, prov:hadMember.

(4) In section 4.3, there are some repeated headings+content that seem, well, 
pointless:
    prov-dm: prov-dm
    prov-n: prov-n
    prov-constraints: prov-constraints
I assume this is an artefact of the document generation process.


I have some more detailed comments that may be considered before or after last call:

1. Introduction.

"defines the normative OWL 2 ontology" - I think it's the definition that's 
normative, not the ontology.  Suggest drop the word "normative"

"PROV-O conforms to the OWL-RL profile" - I think a citation for OWL-RL would 
be appropriate here.

"Starting point classes and properties" - I note that elsewhere we now refer to 
these as "Core structures".  I think "starting point" is OK, but it might be 
helpful to readers to align usage?

3.1 Starting point terms

para 4: "to provide some ordering information" - I found the choice of term 
"ordering" a little strange - isn't what is being described here "dependency" 
information?

paras 4,5:  references to "not interesting" - I'm thinking that the 
activities/entities concerned may be just "not known".

3.2 Expanded terms

Para 5:  Starting from "It is important to note that...".  I found this very hard 
to understand, potentially contradictory, and I'm not sure if I entirely agree 
with it.  I would suggest either dropping this, or radically simplifying the 
paragraph; e.g.
[[
A prov:Bundle is a named set of provenance descriptions, which may itself have 
provenance. The named provenance descriptions may be expressed as PROV-O or in 
some other form.
]]

Para 9 ("The fourth category..."):  why the two exceptions prov:generated and 
prov:invalidated to the not-defining-inverse principle? (The reason given 
doesn't seem adequate to me.)

Para 13: ("However, immediately after ..."):  typo "inmediately"

Para 14: ("Shortly after Derek's..."):  the use here of "ex:monica" as a noun, 
so closely following use of "Derek" seems, at best, inconsistent.  Suggest "Monica".

3.3: Qualified terms

I appreciate this is a tricky topic to explain, especially all in words, but I 
found it really hard going, even though the basic idea is quite easy to 
understand.  I think a diagram immediately following the first paragraph (along 
the lines of those that appear later in fig 3) to illustrate the idea would make 
a great difference.

Also, I think the choice of example didn't help, as it's not so easy to remember 
that "Association" is between an agent and an activity - there are no clues in 
the choice of word.  By comparison, "Usage" as relating an activity and an 
entity is much easier to remember, so I think an example based on that would be 
easier to follow.

Para 2: (immediately following table 2):  This was hard to follow, and seems to 
me to be in the wrong place as it's quite divorced from the document parts it 
actually describes.  I think it would be more useful to introduce this (with a 
back-ref to this section) at the start of section 4 where it's much closer to 
the occurrences of what it describes.

Para 5: (following second example):  I find the use of normative language here 
to be inappropriate, and it verges on dictating application design.  I argue that 
it is quite correct and acceptable for an implementer to use qualified or 
unqualified forms as they choose, and that a consuming application should be 
prepared to recognize either form.  I think it is reasonable to explain the 
consequences of using the available choices, and to suggest an approach to 
encourage consistency, but I wouldn't go beyond that.  Suggest that SHOULD and 
SHOULD NOT here be replaced by non-normative, more descriptive language.

If there's anything that I think should be normative here, it is that consuming 
applications SHOULD recognize both qualified and unqualified forms, and treat 
the qualified form as implying the unqualified form.  (I think this would be the 
most effective invocation of the Postel principle.)
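
As a rough illustration of the consumer behaviour I have in mind, here is a 
minimal Python sketch.  It assumes triples are held as plain (subject, 
predicate, object) tuples of prefixed names, and the ex: resources are 
hypothetical.  It covers only Usage and Association, so it is not a complete 
treatment of the qualified terms.

```python
# Map each qualification property to the property that names the influencer
# and to the unqualified property that the qualified form is taken to imply.
QUALIFIED_FORMS = {
    "prov:qualifiedUsage": ("prov:entity", "prov:used"),
    "prov:qualifiedAssociation": ("prov:agent", "prov:wasAssociatedWith"),
}


def with_unqualified(triples):
    """Return the triples plus the unqualified statements implied by qualified ones."""
    triples = set(triples)
    inferred = set()
    for subject, predicate, qualification in triples:
        if predicate not in QUALIFIED_FORMS:
            continue
        influencer_property, unqualified = QUALIFIED_FORMS[predicate]
        # Look up the influencer attached to the qualification node.
        for s2, p2, influencer in triples:
            if s2 == qualification and p2 == influencer_property:
                inferred.add((subject, unqualified, influencer))
    return triples | inferred


# Hypothetical data: an activity described only in the qualified form.
graph = {
    ("ex:sorting", "prov:qualifiedUsage", "_:u1"),
    ("_:u1", "prov:entity", "ex:dataset"),
    ("_:u1", "prov:hadRole", "ex:inputData"),
}
expanded = with_unqualified(graph)
```

A consumer that works on the expanded set can then look for prov:used without 
caring which form the producer chose.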

4.2 Expanded terms

Class prov:Bundle:  is the notion of provenance *of an entity* actually 
described anywhere?  (As opposed to provenance that is "about entities, 
activities and people".)

This is maybe a bit picky, and is a wider issue than just prov:Bundle, but it's 
the phrasing here that made me ask the question.  I wonder if it wouldn't help 
to be clear that we all understand the same thing for "provenance of an entity".

Further down, the paragraph starting "Note that there are kinds of accounts...". 
  It seems to me this is a statement of what should already be entirely obvious 
- I think it has greater potential to confuse than clarify.  Suggest dropping this.


Class prov:Organization:  "social institutions" - I think is either tautological 
or incorrect, depending on how one understands "social" here (in the current 
climate, I regard many companies as being distinctly *anti*-social).  Suggest 
drop "social" here - I think it's unnecessary.


Property prov:hadPrimarySource:  Text further down refers to "An original 
source..." - I think this should (now) be "A primary source...".


Property prov:invalidatedAtTime:  "...began to be invalidated" reads oddly to 
me.  Suggest just "...was invalidated" (as once the invalidation process has 
started, the essential conditions of being invalidated are already satisfied).


Property prov:value:  I suspect this has already been extensively discussed, but 
I wanted to question this being a *functional* property.  That would exclude, 
for example, giving equivalent integer and floating values for an entity.

But, maybe more fundamentally, is there any specified way to express a value 
that is itself denoted by a URI?  In OWL terms, this needs an object property. 
It's OK if there's no such way, as one can always introduce new properties, but 
it seems odd to me that data values are OK but other values are not.
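
To make the functional-property concern concrete, here is a small Python 
sketch of the check a consumer might apply.  It assumes literals are 
represented as (lexical form, datatype) pairs and that the integer and float 
literals count as distinct values.  The ex: names and the data are 
hypothetical.

```python
from collections import defaultdict


def functional_violations(triples, prop):
    """Return subjects that carry more than one distinct value for prop."""
    values = defaultdict(set)
    for subject, predicate, obj in triples:
        if predicate == prop:
            values[subject].add(obj)
    return {s: vs for s, vs in values.items() if len(vs) > 1}


# Hypothetical data: one entity given both an integer and a float value.
triples = [
    ("ex:measurement", "prov:value", ("42", "xsd:integer")),
    ("ex:measurement", "prov:value", ("42.0", "xsd:float")),
    ("ex:other", "prov:value", ("7", "xsd:integer")),
]
violations = functional_violations(triples, "prov:value")
```

Under those assumptions ex:measurement is flagged, which is the case I would 
like to see permitted.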


prov:wasStartedBy:  "The activity did not exist...".  I've come across this 
phrase several times in reading the PROV documents, and each time find it a bit 
odd.  I would find something like "the activity was not in progress..." to be 
less jarring.  (Activities start and stop, rather than winking in an d out of 
existence.)  It's just a nit, not a big deal.


Property prov:hasAnchor:  I have a suggestion to drop this from PROV-AQ, as its 
intended use is now covered by prov:specializationOf.  Maybe should be marked 
"at risk"?


B. Names of inverse properties

Para 1: I count *two* exceptions, not just one (generated and invalidated)

Para 1, final sentence "This extra effort must be avoided".  I don't agree - the 
extra effort may be just what is needed (or I don't understand what is being 
suggested).  Suggest drop this sentence.

Also, again, why the two exceptions?

Para 4:  "...they may be motivated to assert the inverse of...".  I don't think 
"asserting" the inverse is a problem here.  It's defining a new symbol for it. 
Suggest "...they may be motivated to introduce an inverse property name for..."
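
To illustrate the distinction, here is a minimal Python sketch of what a 
consumer has to do once a new inverse property name is in circulation.  The 
name ex:wasUsedBy is hypothetical, introduced purely for this example, and 
triples are again assumed to be plain tuples of prefixed names.

```python
# Hypothetical inverse name mapped to the PROV-O property it inverts.
# "ex:wasUsedBy" is an assumption for illustration only.
INVERSE_NAMES = {"ex:wasUsedBy": "prov:used"}


def normalize(triples):
    """Rewrite statements that use an introduced inverse name into the PROV-O direction."""
    normalized = set()
    for subject, predicate, obj in triples:
        if predicate in INVERSE_NAMES:
            # Swap subject and object and use the original property name.
            normalized.add((obj, INVERSE_NAMES[predicate], subject))
        else:
            normalized.add((subject, predicate, obj))
    return normalized


published = {("ex:dataset", "ex:wasUsedBy", "ex:sorting")}
canonical = normalize(published)
```

The relationship itself is the same in both directions.  It is only the extra 
symbol that a consumer has to know about.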

...

End of comments.

#g
--

Received on Monday, 9 July 2012 14:56:10 UTC