Re: PROV comments from Clark&Parsia

Hi Evren,

Thanks for your feedback.  I just wanted to respond to the comments about the constraints to make sure I understand them, to inform (my part of) the eventual formal response from the WG.

On Jan 4, 2013, at 5:23 PM, Evren Sirin <evren@clarkparsia.com> wrote:

> Hi all,
> 
> We are working towards supporting PROV inferences and constraints in
> our RDF database Stardog [1]. Below are some comments about PROV
> specification documents that we identified while working on our
> implementation.
> 
> * PROV Constraints
> 
> 1. The flow control arrows in Figure 1 seem to be backwards.

I believe this is because the arrows correspond to "derivation/generation/use" steps, which go backwards in time in PROV.

> 2. Definition 2.1 seems to be missing the id on the right-hand side.

Actually, since this definition applies to entity, agent or activity statements, the "id" argument is just the first ordinary parameter, namely a_1.  Ids cannot be optional for these statements so they are treated uniformly with the other parameters.  It would be equivalent to write out the three equivalences:

entity(id) IF AND ONLY IF entity(id,[])
activity(id,t1,t2)  IF AND ONLY IF activity(id,t1,t2,[])
agent(id) IF AND ONLY IF agent(id,[]).
  
spelling out how to expand attribute lists for entity, activity and agent, and perhaps we should just do this to avoid confusion?

> 3. Since uniqueness constraints are ‘applied’ and can derive new
> atoms, it is misleading to call them constraints. The same applies to
> typing constraints.

This is a reasonable view, but I'd prefer to stick with the existing terminology.  The rationale for it is as follows:

- uniqueness "constraints" do not infer new PROV statements, rather they result in either merging two statements that contain compatible information, or in failure.
- typing "constraints" do not infer new PROV statements, rather they infer auxiliary non-PROV atomic formulas about typing of identifiers, which can in turn lead to detecting invalidity.
- ordering "constraints" do not infer new PROV statements, rather they infer auxiliary ordering formulas that could in turn lead to invalidity.
- only "inferences" result in new PROV statements being added to the instance, and only "constraints" can result in failure.

Of course, logically, distinguishing between PROV statements and auxiliary atoms is arbitrary, and there is no real reason to distinguish between inferences and constraints - we could simply call everything an inference.  If there is a general consensus that the existing terminology is confusing to implementors we will revisit this.

> 4. The definition of enforcement of uniqueness constraints states one
> should apply the resulting substitution to the whole PROV instance.
> However, the scope of the variables is not sets of rules.

I'm not sure I understand this comment.  I believe you are indicating that the explanation of what it means to apply a uniqueness constraint is unclear, because it has nonlocal effects on the whole instance that go beyond the statements mentioned in the constraint.  Will explaining this more clearly (possibly via an example) help, or am I missing the point?

> 5. Inference 9 (wasStartedBy-inference) should be: IF
> wasStartedBy(_id; _a,e1,a1,_t,_attrs), THEN there exist _gen and _t1
> such that wasGeneratedBy(_gen; e1,a1,_t1,[]).
> 6. Inference 10 (wasEndedBy-inference) should be: IF wasEndedBy(_id;
> _a,e1,a1,_t,_attrs), THEN there exist _gen and _t1 such that
> wasGeneratedBy(_gen; e1,a1,_t1,[]).
> 7. Inference 15.4 should be: IF wasStartedBy(id; a2,e,_a1,_t,attrs)
> THEN wasInfluencedBy(id; a2, e, attrs).
> 8. Inference 15.7 should be: IF wasDerivedFrom(id; e2, e1, _a, _g, _u,
> attrs) THEN wasInfluencedBy(id; e2, e1, attrs).

Here (in 5-8), the changes you are proposing are just to add "_"s before some variables that only appear once, correct?  This seems fine.

> 9. Constraint 56 should be: IF hadMember(c,e) and
> 'prov:EmptyCollection' ∈ typeOf(c) THEN INVALID.
> 

I believe here you are pointing out that we wrote "hasMember" which should have been "hadMember", right?  If so, you're right - thanks!

If I understand correctly I believe none of the above changes affect the technical content of the specification (i.e. adopting them will not break existing implementations), so it should be no problem to address them in a future version of PROV-CONSTRAINTS. 

Thanks again for your comments and careful reading,

--James



-- 
The University of Edinburgh is a charitable body, registered in
Scotland, with registration number SC005336.

Received on Tuesday, 8 January 2013 10:36:55 UTC