Re: new release of PROV-DM document from Khalid Belhajjame on 2011-09-21 (public-prov-wg@w3.org from September 2011)

From: Khalid Belhajjame <Khalid.Belhajjame@cs.man.ac.uk>
Date: Wed, 21 Sep 2011 20:22:40 +0100
To: Luc Moreau <L.Moreau@ecs.soton.ac.uk>
CC: Provenance Working Group WG <public-prov-wg@w3.org>
Message-ID: <4E7A3980.1050604@cs.man.ac.uk>
Hi,
Here are some comments on the current working draft of the provenance model.
- 2.3 Representation, Assertion, and Inference
“Different asserters will normally contribute different representations, 
and no attempt is made to define a notion of consistency of such 
different sets of assertions.”
I also think that we should not attempt to define or ensure the 
consistency of the assertions made by the same asserter.
- 3 PROV-DM Overview
In the diagram illustrating the high level overview of the PROV-DM:
• I find the name “entity characterization” given to the association 
between “Entity” and “characterizing attribute” confusing. Would it be 
more sensible to name the association using a term such as “characterizing 
attribute”, and to replace “characterizing attribute” in the diagram with 
“attribute”?
• The cardinalities of the association “entity characterization” are not 
specified. I guess they are 0..* on the side of “characterizing 
attribute”, and 1 on the side of “Entity”.
• Would it be useful to associate Entity with the Interval in which it 
is valid?
• I note that an instance of Entity can be generated at most once by an 
instance of ProcessExecution. I had always assumed that this holds, but I 
am no longer sure. To illustrate my doubt, consider the execution of a 
workflow wf1, denoted by the process execution pe0, and consider the 
process execution pe1 corresponding to the last activity actn in the 
workflow wf1. Now, assume that pe1 generated an entity e. Given the 
relation between wf1 and actn, it follows that pe0 also generates e. (We 
came across this in the example Taverna workflow that is being encoded 
by Stian in the OWL provenance ontology.)
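The scenario above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the tuple encoding of generation assertions, the `sub_execution_of` mapping, and the rule that generation propagates from a sub-execution to its enclosing execution are all hypothetical and are not taken from the PROV-DM draft.

```python
from collections import defaultdict

# Hypothetical encoding of generation assertions as (entity, process execution).
asserted_generations = {("e", "pe1")}

# Assumption: pe1 (the execution of actn) is a sub-execution of pe0 (the run of wf1).
sub_execution_of = {"pe1": "pe0"}


def infer_generations(generations, parents):
    """Add (entity, enclosing execution) for every enclosing execution."""
    inferred = set(generations)
    for entity, pe in generations:
        parent = parents.get(pe)
        while parent is not None:
            inferred.add((entity, parent))
            parent = parents.get(parent)
    return inferred


def generators_by_entity(generations):
    """Group process executions by the entity they generate."""
    grouped = defaultdict(set)
    for entity, pe in generations:
        grouped[entity].add(pe)
    return dict(grouped)


all_generations = infer_generations(asserted_generations, sub_execution_of)

# Entities that end up with more than one generating process execution.
violations = {
    entity: pes
    for entity, pes in generators_by_entity(all_generations).items()
    if len(pes) > 1
}
print(violations)
```

Under these assumptions, e is reported with both pe0 and pe1 as generators, which is exactly the case that an "at most once" constraint would rule out.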
- 4.2 Encoding using PROV-ASN
typo: UsedExpressions -> Used Expressions
- 5.2.4 Annotation
Why is an annotation identified by an id? Wouldn’t it be better to use 
instead the id of the elementExpression subject to annotation? 
Did you opt for this because an annotation can apply to multiple 
element expressions?
The same observation also applies to annotationAssociationExpression.
- 5.3.3.1 Process Execution Linked Derivation Assertion
In the definition of wasDerivedFrom, the qualifiers q2 and q1 seem to be 
redundant, as they should, I think, be specified within the context of 
use and generation instead.
You have added a note asking: “Should this dependency of attributes 
be made explicit as argument of the derivation expression? By making it 
explicit, we would allow someone to verify the validity of the 
derivation expression.”
I was thinking of adding a derivation-qualifier to wasDerivedFrom(e2,e1), 
but instead of being a set of attribute-value pairs, it can be specified 
as a set of pairs of the form <b,B>, where b is a characterizing attribute 
of e2 and B is the set of characterizing attributes of e1 that were used 
to compute the value of b.
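A minimal sketch of how such a set of <b,B> pairs could be represented and checked, assuming entities are modelled as plain dictionaries of characterizing attributes. The function name `check_derivation_qualifier` and the attribute names are hypothetical and chosen only for illustration.

```python
def check_derivation_qualifier(e2_attributes, e1_attributes, qualifier):
    """Return a list of problems; an empty list means the qualifier is consistent.

    qualifier maps each attribute b of e2 to the set B of attributes of e1
    that were used to compute the value of b.
    """
    problems = []
    for b, used in qualifier.items():
        # b must be a characterizing attribute of the derived entity e2.
        if b not in e2_attributes:
            problems.append(f"{b!r} is not a characterizing attribute of e2")
        # Every attribute in B must be a characterizing attribute of e1.
        unknown = sorted(set(used) - set(e1_attributes))
        if unknown:
            problems.append(f"{b!r} depends on attributes missing from e1: {unknown}")
    return problems


# Hypothetical attributes, chosen only for illustration.
e1 = {"content": "draft text", "author": "alice"}
e2 = {"content": "DRAFT TEXT", "length": 10}

# One <b,B> pair per attribute of e2 whose value was computed from e1.
qualifier = {
    "content": {"content"},
    "length": {"content"},
}

print(check_derivation_qualifier(e2, e1, qualifier))
```

This only checks that the pairs refer to attributes that exist on each side; verifying that the values were really computed that way would need more than the qualifier itself.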
- 5.5.1 Qualifier
“A qualifier’s sequence of name-value pairs MAY be empty”. Wouldn’t it 
make sense to require that at least the role be specified in the case of 
use, generation and control?
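As a rough sketch of what such a requirement would mean in practice, assuming a qualifier is modelled as a dictionary of name-value pairs. The relation kind strings and the `role` key are hypothetical spellings, not names from the draft.

```python
# Assumption: these relation kinds would require a role in their qualifier.
ROLE_REQUIRED = {"use", "generation", "control"}


def qualifier_is_acceptable(relation_kind, qualifier):
    """Accept an empty qualifier unless the relation kind requires a role."""
    if relation_kind in ROLE_REQUIRED:
        return "role" in qualifier
    return True


print(qualifier_is_acceptable("use", {}))
print(qualifier_is_acceptable("use", {"role": "input"}))
print(qualifier_is_acceptable("derivation", {}))
```

With this rule an empty qualifier remains allowed for other relations, but is rejected for use, generation and control.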


Thanks, khalid

On 19/09/2011 20:47, Luc Moreau wrote:
>
> Dear all,
>
> Paolo and I have edited the document. We are very aware that
> it needs proof reading and wordsmithing, but we are also keen
> to get feedback from the WG.
>
>
> The document underwent substantial reorganization. Section 2,
> preliminaries, now includes key material setting the context for the
> definition of the data model:
> - conceptualization of the world
> - ASN
> - discussion on representation/assertion/inference
>
> The following issues have been addressed in this version of the document
> and have been closed pending review:
>
> - ISSUE-87: section 2.2 now explains the role of PROV-ASN. Its role
> is now more central in this document (as reflected in the new title).
>
> - ISSUE-86: a high-level overview of the data model is now available
> in section 3.
>
> - ISSUE-71: multiple comments about the example have now been tackled.
>
> - ISSUE-65: extensibility points are now explicitly discussed in
> section 6.
> To support extensibility, annotations were introduced.
>
> - ISSUE-85: the email discussions have indicated where the origin of
> the confusion arises from. Multiple changes have been introduced to
> tackle this issue:
> - preliminaries section: to introduce conceptual model
> - section 6, PROV DM: refers to 'entity expression' and to
> 'xxx expression'
>
>
> Cheers,
> Luc
>
>
Received on Wednesday, 21 September 2011 19:23:15 UTC