Re: Internal Review Prov Dictionary

Hi Simon,

thanks a lot for your review. I've included your suggested changes in the
document and responded below.

> One matter that confused me was whether insertion and removal are
> operations, i.e. activities that happen to one dictionary to create
> another, or differences, i.e. a comparison between two dictionaries. In the
> end, given the constraints you define, I decided they must be differences,
> e.g. d2 wasDerivedByInsertionFrom d1 means that the difference between d2
> and d1 is that d2 is a superset of d1 with some explicit new entries.
> However, the text (especially Section 3) talks as if they are operations,
> referring to "following insertion" or "after a removal", e.g. "An Insertion
> relation... states that d2 is the dictionary following insertion of
> pairs... into dictionary d1."
>
> This distinction has practical consequences. If I wanted to describe, in
> the provenance data, the complete membership of my office on 2013-04-02, I
> could say:
>
>   entity (simon-office-oncreation, [prov:type='prov:EmptyDictionary'])
>   entity (simon-office-20130402, [prov:type='prov:Dictionary'])
>   derivedFromInsertionFrom (simon-office-20130402, simon-office-oncreation,
>       {("on-black-chair", "simon")})
>
> From this, I would know that Simon was the only member of
> simon-office-20130402.  However, the last line is only true information if
> insertion is merely the difference between the two dictionaries, and other
> insertions and removals could have occurred in between time. If insertion
> is an operation, then it suggests no-one has entered or left my office
> since its creation, which is untrue. In summary, I think it would help to
> clarify in the text whether insertion and removal should be read as diffs
> or operations. If they can be interpreted as diffs, then I think that makes
> the model more flexible.
>
>
We agree with your comments, and this is indeed how a dictionary should be
used in the context of provenance. This way you could, for example, specify
the members of a baseball team from season to season without having to
use a separate insertion/removal every time someone leaves during the
season.
I've modified the explanatory text for insertion/removal to clarify this:

> An Insertion relation prov:derivedByInsertionFrom(id; d2, d1, {(key_1,
> e_1), ..., (key_n, e_n)}) states that d2 is the dictionary following the
> insertion of key-entity pairs (key_1, e_1), ..., (key_n, e_n) into
> dictionary d1. In other words, the set of key-entity pairs (key_1, e_1),
> ..., (key_n, e_n) is to be seen as the difference between d1 and d2. Note
> that this key-entity-set is considered to be complete. This means that we
> assume that no unknown keys were inserted in or removed from a dictionary
> derived by an insertion relation. This is formalized in Inference D8.
>
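
To make the "difference" reading concrete, here is a rough Python sketch of
how I think of it. It is only an illustration and not part of the document:
the function name derive_by_insertion is made up, plain Python dicts stand in
for PROV dictionaries, and I assume that an inserted key replaces an existing
pair with the same key.

```python
def derive_by_insertion(d1, inserted_pairs):
    """Return the dictionary d2 implied by reading insertion as a diff from d1.

    Hypothetical helper: d1 and inserted_pairs are plain dicts mapping
    keys to entity identifiers.
    """
    d2 = dict(d1)              # every pair of d1 carries over ...
    d2.update(inserted_pairs)  # ... unless an inserted pair has the same key (assumption)
    return d2


# Simon's office example: the starting dictionary is empty.
office_on_creation = {}
office_20130402 = derive_by_insertion(
    office_on_creation, {"on-black-chair": "simon"}
)

# The inserted key-entity set is taken to be complete, so the full
# membership of the derived dictionary is known from d1 and the diff alone.
print(office_20130402)
```

Nothing in the sketch says how many people came and went between the two
dictionaries; it only relates their contents, which is the point of the
difference reading.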

Ditto for removal, and I've made sure to revise statements like "after
removal". Does this address your concerns sufficiently?

> Other comments:
>
> Section 3: Just above Example 1: "to explicitly state that a dictionary is
> empty, it is recommended that the prov:type prov:EmptyCollection is used".
> Shouldn't that be prov:EmptyDictionary?
>
Yes indeed, well spotted!


> It seems a shame that "keys cannot repeated in the same dictionary", as it
> is somewhat of a restriction, but I understand it makes the update and
> removal semantics a lot cleaner and seems justified for that reason.
>

Yes, that is the reason. Otherwise we would have to leave the constructs
very unconstrained, and we gathered from the comments of the group that the
constraints are exactly what makes it interesting to use a dictionary
instead of a collection.


> Example 2: "// d1 is a dictionary" -> "d1" should be "d"
>
> Section 4.1: "PROV-Dictionary provides no dedicated syntax for Collection
> and EmptyCollection." - I think you mean Dictionary and EmptyDictionary?
>
> Example 9: "d1 is the identifier" -> should be "d3"
>
> Section 5: "(Note that this file is unfinished at the time of this working
> draft" -> this will no longer be a working draft on next release.
>
> Section 5.2: "Class: prov:Dictionary back to overview" (and similarly for
> other definitions) - I assume that the "back to overview" should not be
> part of the title but a separate link? Otherwise, I don't understand the
> title.
>
> Section 5.2, prov:Dictionary definition: "are said to be member" -> should
> be "members"
>
All updated. Thanks for spotting these typos!


> Inference D6: "and K1 is a set of keys" - K1 does not appear in the
> inference rule.
>
Indeed, this was supposed to go with D7.


> Constraint D11 is called D10 (so there are two D10s).
>
Done.

Could you give the updated document a quick glance and tell us whether your
comments are resolved?
https://dvcs.w3.org/hg/prov/raw-file/default/dictionary/Overview.html
Thanks!

- Tom

Received on Wednesday, 10 April 2013 09:21:13 UTC