Re: PROV-DICTIONARY internal review for first public working draft (ISSUE-614) from James Cheney on 2013-01-22 (public-prov-wg@w3.org from January 2013)

From: James Cheney <jcheney@inf.ed.ac.uk>
Date: Tue, 22 Jan 2013 17:52:24 +0000
To: Tom De Nies <tom.denies@ugent.be>
Cc: Provenance Working Group <public-prov-wg@w3.org>
Message-Id: <3B23092D-E143-4223-866C-79EE5E35D6EA@inf.ed.ac.uk>


Hi,

Here are my review comments.

--James

> Questions for reviewers
> - Is the notation of Dictionary concepts clear & acceptable for you? (in PROV-N, PROV-O and/or PROV-XML)

Yes

> - Are the constraints acceptable, or are they too loose/too strict?

See below

> -- In particular, can the constraint "IF derivedByRemovalFrom(d2, d1, {"k1"}) THEN hadDictionaryMember(d1, e1, "k1") " be dropped, or do you strongly support it?

Happy with dropping it

> - Is the name PROV-DICTIONARY appropriate for the document?

Yes

> - Can this be released as a first public working draft?

Yes, unless suggestions are controversial.  

I think this is a reasonable working draft for the purpose of getting feedback from potential adopters.  The constraints need some work, but this does not need to be ironclad at this stage.

> - If not, where are the blocking issues?

> - If yes, are there other issues to work on?

See suggestions below.

=== Detailed review ===

General things:

1.  The syntax for insertion and deletion allows multiple keys to be inserted or deleted at once, but in several places the document considers only the special case of a single insertion or deletion.  It isn't always obvious how to generalize these to arbitrary multiple insertions or deletions (if this is intended).

2.  It would be nice if the constraints/inferences had names or numbers.

3. The superscripts op and dp, and other conventions from the prov-o CR, are not explained locally; please mention where these are explained.

Constraints:
--
> IF hadDictionaryMember(d1, e1, "k1") and derivedByInsertionFrom(d2, d1, {("k2", _e2)}) and k1 ≠ k2 THEN hadDictionaryMember(d2, e1, "k1")

In general this could be 
IF hadDictionaryMember(d1, e, "k") and derivedByInsertionFrom(d2, d1, {("k1",e1),…,("kn",en)}) and k \notin {k1,…,kn}  THEN hadDictionaryMember(d2, e, "k")

--
> IF hadDictionaryMember(d1, e1, "k1") and derivedByInsertionFrom(d2, d1, {("k1", e2)}) THEN hadDictionaryMember(d2, e2, "k1"))
> 
This constraint is a special case of the next one:

> IF derivedByInsertionFrom(d2, d1, {("k1", e1)}) THEN hadDictionaryMember(d2, e1, "k1")

The above constraint could be generalized to:

IF derivedByInsertionFrom(d2, d1, {("k1", e1),…,("kn",en)}) THEN hadDictionaryMember(d2, ei, "ki") (for each i in [1..n]).
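
To make the two generalized insertion rules concrete, here is a minimal Python sketch.  It assumes a dictionary can be modelled as a plain Python dict from keys to entity names; derive_by_insertion is a hypothetical helper for illustration, not part of any PROV tooling.

```python
def derive_by_insertion(d1, inserted):
    """Model derivedByInsertionFrom(d2, d1, inserted) and return d2.

    d1 and inserted are plain dicts mapping keys to entity names
    (an assumption made for illustration only).
    """
    d2 = dict(d1)        # members of d1 with untouched keys carry over
    d2.update(inserted)  # every inserted pair (ki, ei) becomes a member of d2
    return d2


d1 = {"k": "e", "k1": "old"}
inserted = {"k1": "e1", "k2": "e2"}
d2 = derive_by_insertion(d1, inserted)

# Rule 1: hadDictionaryMember(d1, e, "k") and k not in {k1,...,kn}
#         implies hadDictionaryMember(d2, e, "k").
carried_over = all(d2[k] == e for k, e in d1.items() if k not in inserted)

# Rule 2: hadDictionaryMember(d2, ei, "ki") for each inserted pair.
inserted_present = all(d2[k] == e for k, e in inserted.items())
```

In this model the single-pair constraints are just the n = 1 case of the same two checks.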

--
> IF derivedByRemovalFrom(d2, d1, {"k1"}) THEN hadDictionaryMember(d1, e1, "k1")

can likewise be generalized.  However, it is also proposed for deletion, which is fine with me too.

--
> IF derivedByInsertionFrom(d2, d1, {_("k1", e1)}) THEN wasDerivedFrom(d2, d1)
> 
> IF derivedByRemovalFrom(d2, d1, {_"k1"}) THEN wasDerivedFrom(d2, d1)
> 
These can both be immediately generalized to arbitrary key-value sets or key sets.  

I think the underscores above are typos, though.

--
> IF derivedByInsertionFrom(d2, d1, {("k1", _e1)}) and derivedByRemovalFrom(d3, d2, {"k1"}) THEN hadDictionaryMember(d1, e2, "k2") holds IF AND ONLY IF hadDictionaryMember(d3, e2, "k2")

Two potential problems:
1. You need to be careful here, because if k1 = k2 then there is no reason to believe that its value will be the same in d1 and d3: the insertion may have overwritten an existing value for k1 in d1, and the subsequent removal deletes the key from d3 altogether.  

2. This inference has more complex structure than those in the constraints.  I think you mean:

IF derivedByInsertionFrom(d2, d1, {("k1", _e1)}) and derivedByRemovalFrom(d3, d2, {"k1"}) 

THEN

(for all k2, e2. hadDictionaryMember(d1, e2, "k2") holds IF AND ONLY IF hadDictionaryMember(d3, e2, "k2"))

This potentially goes beyond the formalism we have been using in the constraints, but I think you can decompose it into the two constraints, which (I think) have the same effect:

IF derivedByInsertionFrom(d2, d1, {("k1", _e1)}) and derivedByRemovalFrom(d3, d2, {"k1"}) and hadDictionaryMember(d1, e2, "k2") and k1 \neq k2 THEN hadDictionaryMember(d3, e2, "k2")

IF derivedByInsertionFrom(d2, d1, {("k1", _e1)}) and derivedByRemovalFrom(d3, d2, {"k1"}) and hadDictionaryMember(d3, e2, "k2") and k1 \neq k2 THEN hadDictionaryMember(d1, e2, "k2")

Alternatively, this constraint could be dropped, as previous constraints allow us to reason about the contents of a dictionary after an insertion and deletion of the same key.  This should avoid both problems.
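
A small sketch of the k1 = k2 case may help here.  It again models dictionaries as plain Python dicts (an assumption), with hypothetical helpers derive_by_insertion and derive_by_removal standing in for the two relations.

```python
def derive_by_insertion(d1, inserted):
    """Model derivedByInsertionFrom(d2, d1, inserted) and return d2."""
    d2 = dict(d1)
    d2.update(inserted)
    return d2


def derive_by_removal(d1, removed_keys):
    """Model derivedByRemovalFrom(d2, d1, removed_keys) and return d2."""
    return {k: e for k, e in d1.items() if k not in removed_keys}


# d1 already has a member under "k1" before the insertion.
d1 = {"k1": "e_old", "k2": "e2"}
d2 = derive_by_insertion(d1, {"k1": "e1"})
d3 = derive_by_removal(d2, {"k1"})

# For every key other than "k1", d1 and d3 agree (the decomposed constraints).
other_keys = (set(d1) | set(d3)) - {"k1"}
same_on_other_keys = all(d1.get(k) == d3.get(k) for k in other_keys)

# For "k1" itself the unrestricted IF AND ONLY IF fails:
# d1 has a member under "k1" but d3 does not.
differs_on_k1 = "k1" in d1 and "k1" not in d3
```

Under these assumptions the equivalence only holds once the side condition k1 \neq k2 is added.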

--

> IF derivedByRemovalFrom(d2, d1, {_"k1"}) and derivedByInsertionFrom(d2, d1, {_("k2", e2)})THEN INVALID

This is also fine if you allow an arbitrary insertion and deletion set.

--
> IF IF derivedByInsertionFrom(d2, d1, {_("k1", e1)}) and derivedByInsertionFrom(d2, d1, {_("k2", e2)})THEN INVALID

Extra "IF", unnecessary underscores

> IF derivedByRemovalFrom(d2, d1, {_"k1"}) and derivedByRemovalFrom(d2, d1, {_"k2"})THEN INVALID

Unnecessary underscores.

Also, I think the above two constraints might really be functional dependencies in disguise.  That is, would 

IF derivedByInsertionFrom(d2, d1, KV1) and derivedByInsertionFrom(d2, d1, KV2) THEN KV1 = KV2

IF derivedByRemovalFrom(d2, d1, K1) and derivedByRemovalFrom(d2, d1, K2) THEN K1 = K2

be acceptable instead?  If so, that is sensible as a first step, but considering equalities on sets of keys/KV pairs is a potential complication.
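
As a rough illustration of the functional-dependency reading, the sketch below flags any (d2, d1) pair that appears with two different argument sets.  The function name, the tuple layout of the statements, and the use of frozenset equality are all assumptions for illustration; in particular, plain set equality sidesteps the complication of comparing sets of keys or key-value pairs mentioned above.

```python
def find_fd_violations(statements):
    """Return the (d2, d1) pairs that occur with more than one argument set.

    statements is an iterable of (d2, d1, arg) triples, where arg is a
    frozenset of keys (for removal) or of (key, entity) pairs (for insertion).
    """
    seen = {}
    violations = []
    for d2, d1, arg in statements:
        pair = (d2, d1)
        if pair in seen and seen[pair] != arg and pair not in violations:
            violations.append(pair)
        seen.setdefault(pair, arg)
    return violations


insertions = [
    ("d2", "d1", frozenset({("k1", "e1")})),
    ("d2", "d1", frozenset({("k1", "e1")})),  # same set twice: consistent
    ("d4", "d3", frozenset({("k1", "e1")})),
    ("d4", "d3", frozenset({("k2", "e2")})),  # different set: KV1 != KV2
]
violations = find_fd_violations(insertions)
```

The same function can be applied to removal statements by passing frozensets of keys instead of key-value pairs.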

--
> IF hadDictionaryMember(d, e1, "k1") and 'prov:EmptyDictionary' ∈ typeOf(d)THEN INVALID

This seems to follow already from the earlier inference (hadDictionaryMember implies hadMember) and the fact that an empty dictionary is an empty collection.  I don't see a problem with making it explicit, though.
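
The derivation can be sketched in two steps, assuming statements are stored as simple tuples and sets (a hypothetical encoding, not the PROV-N syntax): first apply the hadDictionaryMember-implies-hadMember inference, then check that no empty collection has a member.

```python
# (dictionary, entity, key) triples; the encoding is illustrative only.
had_dictionary_member = [("d", "e1", "k1")]

# Things typed prov:EmptyCollection, here because d is a prov:EmptyDictionary.
empty_collections = {"d"}

# Inference: hadDictionaryMember(d, e, k) implies hadMember(d, e).
had_member = {(d, e) for d, e, _key in had_dictionary_member}

# An empty collection must not have any members.
invalid = sorted({c for c, _entity in had_member if c in empty_collections})
```

With the example data above, d ends up in the invalid list without any dictionary-specific constraint being applied.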

--
Minor things:

> prov:EmptyDictionary is a subtype of prov:EmptyCollection. It denotes an empty dictionary.

And a subtype of prov:Dictionary, right? This seems to be said in the ontology later.

>   Thus, :c3 does not contain the members ("k1", :e1) and ("k2", :e2( from :c2.

On Jan 16, 2013, at 8:17 AM, Tom De Nies <tom.denies@ugent.be> wrote:

> Hi everyone,
> 
> Our apologies that this mail did not go out sooner. We had some trouble with our university mailing server, and couldn't send any mails anymore from our approved email addresses.
> I sent an email to the list from another address on Friday, but apparently it didn't get through. 
> 
> PROV-DICTIONARY is now ready for internal review.
> This document is on the NOTE track, and we'd like to publish a working draft by the time the RECs go to PR.
> 
> The latest editor's draft is here: https://dvcs.w3.org/hg/prov/raw-file/default/dictionary/prov-dictionary.html#dictionary-xml-schema
> 
> The following people volunteered for reviewing the document: Paolo, Stian, James (maybe), Luc, and Paul, but others are also welcome to review of course.
> If you only have bandwidth to review part of the document (e.g. only the ontology section), that could be useful as well.
> 
> Questions for reviewers
> - Is the notation of Dictionary concepts clear & acceptable for you? (in PROV-N, PROV-O and/or PROV-XML)
> - Are the constraints acceptable, or are they too loose/too strict?
> -- In particular, can the constraint "IF derivedByRemovalFrom(d2, d1, {"k1"}) THEN hadDictionaryMember(d1, e1, "k1") " be dropped, or do you strongly support it?
> - Is the name PROV-DICTIONARY appropriate for the document?
> - Can this be released as a first public working draft?
> - If not, where are the blocking issues?
> - If yes, are there other issues to work on?
> 
> In your review please include ISSUE-614
> 
> Due to the delay in sending this notification, I suggest we allow a little more time to review the document. 
> We propose the due date for review to be on Wednesday the 23rd, so that we can vote on the revised document on the 24th.
> 
> Thanks in advance to all the reviewers.
> 
> Regards,
> Tom & Sam


Received on Tuesday, 22 January 2013 17:52:50 UTC