Re: PROV-DICTIONARY internal review for first public working draft (ISSUE-614) from Tom De Nies on 2013-01-24 (public-prov-wg@w3.org from January 2013)

From: Tom De Nies <tom.denies@ugent.be>
Date: Thu, 24 Jan 2013 11:52:15 +0100
To: James Cheney <jcheney@inf.ed.ac.uk>
Cc: Provenance Working Group <public-prov-wg@w3.org>
Message-ID: <CA+=hbbeD9Cy=CewYBK0VY0+mvVp=46ZNyg_OtZteT3B6aWoTuQ@mail.gmail.com>
Hi James,

Thanks a lot for your extensive review of the constraints. Your comments
were very useful in making the constraints more robust.

I've included our responses to your review inline below.


> === Detailed review ===
>
> General things:
>
> 1.  The syntax for insertion and deletion allows multiple keys to be
> inserted or deleted at once, but in several places the document considers
> only the special case of a single insertion or deletion.  It isn't always
> obvious how to generalize to handle arbitrary multiple insertions or
> deletions (if this is intended)
>

I assume you mean in the constraints? In the conceptual and prov-n sections
multiple key-value pairs are used in the examples. I've generalized the
constraints according to your suggestions.


>
> 2.  It would be nice if the constraints/inferences had names or numbers.
>
>
Done.


> 3. The superscripts op and dp, and other conventions from the prov-o CR,
> are not explained locally; please mention where these are explained.
>
True. I've added a reference to PROV-O.


> Constraints:
> --
>
> IF hadDictionaryMember(d1, e1, "k1") and derivedByInsertionFrom(d2, d1,
> {("k2", _e2)}) and k1 ≠ k2 THEN hadDictionaryMember(d2, e1, "k1")
>
> In general this could be
>
> IF hadDictionaryMember(d1, e, "k") and derivedByInsertionFrom(d2, d1,
> {("k1",e1),…,("kn",en)}) and k \notin {k1,…,kn}  THEN hadDictionaryMember(d2,
> e, "k")
>
Done.

> --
>
> IF hadDictionaryMember(d1, e1, "k1") and derivedByInsertionFrom(d2, d1,
> {("k1", e2)}) THEN hadDictionaryMember(d2, e2, "k1")
>
> This constraint is a special case of the next one:
>

Indeed, I removed it and added the text about update semantics to inference
4.

>
> IF derivedByInsertionFrom(d2, d1, {("k1", e1)}) THEN hadDictionaryMember(d2,
> e1, "k1")
>
> The above constraint could be generalized to:
>
> IF derivedByInsertionFrom(d2, d1, {("k1", e1),…,("kn",en)}) THEN hadDictionaryMember(d2,
> ei, "ki") (for each i in [1..n]).
>
Done.
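
To illustrate how the two generalized insertion inferences combine, here is a
minimal Python sketch. It is only an illustration: the function name and the
modelling of membership statements as (key, entity) pairs are assumptions made
for this example, not something defined in PROV-DICTIONARY.

```python
def members_after_insertion(d1_members, inserted):
    """Return the members of d2 that can be inferred.

    d1_members: set of (key, entity) pairs known through
                hadDictionaryMember(d1, e, "k")
    inserted:   set of (key, entity) pairs taken from
                derivedByInsertionFrom(d2, d1, {("k1", e1), ..., ("kn", en)})
    """
    inserted_keys = {k for k, _ in inserted}
    # Members of d1 whose key is not among k1..kn carry over to d2.
    carried = {(k, e) for k, e in d1_members if k not in inserted_keys}
    # Every inserted pair (ki, ei) is a member of d2 (update semantics).
    return carried | set(inserted)


d1 = {("k1", "e1"), ("k2", "e2")}
d2 = members_after_insertion(d1, {("k2", "e3"), ("k4", "e4")})
```

Here "k1" carries over unchanged, "k2" is overwritten with e3, and "k4" is new.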

> --
>
> IF derivedByRemovalFrom(d2, d1, {"k1"}) THEN hadDictionaryMember(d1, e1,
> "k1")
>
> can likewise be generalized.  However, it is also proposed for deletion,
> which is fine with me too.
>

Since all reviewers (and editors) were in favor of deletion, this one has
been taken out.


> --
>
> IF derivedByInsertionFrom(d2, d1, {_("k1", e1)}) THEN wasDerivedFrom(d2,
> d1)
>
> IF derivedByRemovalFrom(d2, d1, {_"k1"}) THEN wasDerivedFrom(d2, d1)
>
> These can both be immediately generalized to arbitrary key-value sets or
> key sets.
>
> I think the underscores above are typos, though.
>
Fixed.


> --
>
> IF derivedByInsertionFrom(d2, d1, {("k1", _e1)}) and derivedByRemovalFrom(d3,
> d2, {"k1"}) THEN hadDictionaryMember(d1, e2, "k2") holds IF AND ONLY IF hadDictionaryMember(d3,
> e2, "k2")
>
> Two potential problems:
> 1. You need to be careful here, because if k1 = k2 then there is no reason
> to believe that its value will be the same in d1 and d3, as it has been
> deleted and then re-inserted with a potentially different entity value.
>
> 2. This inference has more complex structure than those in the
> constraints.  I think you mean:
>
> IF derivedByInsertionFrom(d2, d1, {("k1", _e1)}) and derivedByRemovalFrom(d3,
> d2, {"k1"})
>
> THEN
>
> (for all k2, e2. hadDictionaryMember(d1, e2, "k2") holds IF AND ONLY IF hadDictionaryMember(d3,
> e2, "k2"))
> This potentially goes beyond the formalism we have been using in the
> constraints, but I think you can decompose it into the two constraints,
> which (I think) have the same effect:
>
> IF derivedByInsertionFrom(d2, d1, {("k1", _e1)}) and derivedByRemovalFrom(d3,
> d2, {"k1"}) and hadDictionaryMember(d1, e2, "k2") and k1 \neq k2 THEN hadDictionaryMember(d3,
> e2, "k2")
> IF derivedByInsertionFrom(d2, d1, {("k1", _e1)}) and derivedByRemovalFrom(d3,
> d2, {"k1"}) and hadDictionaryMember(d3, e2, "k2") and k1 \neq k2 THEN hadDictionaryMember(d1,
> e2, "k2")
>
> Alternatively, this constraint could be dropped, as previous constraints
> allow us to reason about the contents of a dictionary after an insertion
> and deletion of the same key.  This should avoid both problems.
>
>
I've changed them according to your suggestion, which I think solves both
problems, correct? I wouldn't be entirely comfortable with dropping them,
since we then lose the ability to state that a dictionary is complete
through reasoning (unless it can be proven that this is still possible
without these constraints).
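
The role of the k1 ≠ k2 guard can be seen in a small Python sketch. This is an
illustration under assumptions: dictionaries are modelled as plain Python
dicts with complete contents, and the helper names insert and remove are
hypothetical.

```python
def insert(d, pairs):
    """Model derivedByInsertionFrom: copy d, then add or overwrite the pairs."""
    out = dict(d)
    out.update(pairs)
    return out


def remove(d, keys):
    """Model derivedByRemovalFrom: copy d without the given keys."""
    return {k: e for k, e in d.items() if k not in keys}


d1 = {"k1": "a", "k2": "b"}
d2 = insert(d1, {"k1": "c"})   # derivedByInsertionFrom(d2, d1, {("k1", c)})
d3 = remove(d2, {"k1"})        # derivedByRemovalFrom(d3, d2, {"k1"})

# For every key other than k1, d1 and d3 have the same members.
other_keys = (set(d1) | set(d3)) - {"k1"}
agree_elsewhere = all(d1.get(k) == d3.get(k) for k in other_keys)

# For k1 itself the equivalence fails: d1 has a member under k1, d3 does not.
k1_in_d1 = "k1" in d1
k1_in_d3 = "k1" in d3
```

In this example d1 and d3 agree on "k2", while "k1" is present in d1 but
absent from d3, which is the case the guard excludes.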


> --
>
> IF derivedByRemovalFrom(d2, d1, {_"k1"}) and derivedByInsertionFrom(d2,
> d1, {_("k2", e2)}) THEN INVALID
>
> This is also fine if you allow an arbitrary insertion and deletion set.
>
Done.

> --
>
> IF IF derivedByInsertionFrom(d2, d1, {_("k1", e1)}) and derivedByInsertionFrom(d2,
> d1, {_("k2", e2)}) THEN INVALID
>
> Extra "IF", unnecessary underscores
>
Done. (And well spotted!)

> IF derivedByRemovalFrom(d2, d1, {_"k1"}) and derivedByRemovalFrom(d2, d1,
> {_"k2"}) THEN INVALID
>
> Unnecessary underscores.
>
> Also, I think the above two constraints might really be functional
> dependencies in disguise.  That is, would
>
> IF derivedByInsertionFrom(d2, d1,KV1) and derivedByInsertionFrom(d2, d1,
> KV2) THEN KV1 = KV2
>
> IF derivedByRemovalFrom(d2, d1, K1) and derivedByRemovalFrom(d2, d1, K2)
> THEN K1=K2
> be acceptable instead?  If so, that is sensible as a first step, but
> considering equalities on sets of keys/KV pairs is a potential complication.
>
>
I've changed them according to your suggestion.
The notation here will have to change in the next draft. I suggest we raise
an issue for it after the draft has been published, and discuss via email
how to fix it by the next iteration.
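
As a rough illustration of the functional-dependency reading, the following
Python sketch checks that each (d2, d1) pair has at most one insertion set.
It assumes the key-value sets are fully known constants, so that two
different sets can simply be reported as a conflict; the function name and
the tuple layout are hypothetical.

```python
def conflicting_insertions(statements):
    """Return the (d2, d1) pairs that are given two different insertion sets.

    statements: iterable of (d2, d1, pairs), one per
                derivedByInsertionFrom(d2, d1, KV) statement, where pairs
                is an iterable of (key, entity) tuples.
    """
    seen = {}
    conflicts = []
    for d2, d1, pairs in statements:
        kv = frozenset(pairs)
        # Remember the first set seen for this (d2, d1) pair.
        previous = seen.setdefault((d2, d1), kv)
        if previous != kv:
            # KV1 = KV2 cannot hold for two distinct constant sets.
            conflicts.append((d2, d1))
    return conflicts


ok = conflicting_insertions([
    ("d2", "d1", [("k1", "e1")]),
    ("d2", "d1", [("k1", "e1")]),
])
bad = conflicting_insertions([
    ("d2", "d1", [("k1", "e1")]),
    ("d2", "d1", [("k2", "e2")]),
])
```

The same check would apply to derivedByRemovalFrom with key sets in place of
key-value sets.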

> --
>
> IF hadDictionaryMember(d, e1, "k1") and 'prov:EmptyDictionary' ∈ typeOf(d)
> THEN INVALID
>
> This seems to follow already from the earlier inference
> (hadDictionaryMember implies hadMember) and the fact that an empty
> dictionary is an empty collection.  I don't see a problem with making it
> explicit, though.
>
Correct. I've removed it, since we already have plenty of constraints and
can do without redundant ones :)


> --
> Minor things:
>
> prov:EmptyDictionary is a subtype of prov:EmptyCollection. It denotes an
> empty dictionary.
>
>
> And a subtype of prov:Dictionary, right? This seems to be said in the
> ontology later.
>
Indeed, this was overlooked. Fixed.

Received on Thursday, 24 January 2013 10:52:45 UTC