Re: PROV-ISSUE-138 (collection-collision): Collection does not describe multiple additions/replacements [Data Model]

This issue is still open.

As argued for ISSUE-136, I think
CollectionAfterInsertion/CollectionAfterRemoval should be functional
on all parameters, so multiple insertions/removals can't be asserted
in a single go.


Thus my original suggestion still stands - modified below for new syntax:


Suggested:
CollectionAfterInsertion(c2,c1, k1, v1)
asserts that c2 now contains the key-value pair (k1, v1), but that c1,
the collection c2 is derived from, did not contain the key k1. c1
might contain v1 under a different key.



CollectionAfterInsertion(c2,c1, k1, v1)
CollectionAfterInsertion(c2,c1, k2, v2)

is not valid in PROV-DM unless k1==k2 and v1==v2
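
As a rough illustration of that constraint, here is a minimal Python
sketch. It assumes each assertion is written as a plain tuple
(result, source, key, value); the function name find_conflicts and that
tuple layout are made up for this example and are not part of PROV-DM.

```python
from collections import defaultdict


def find_conflicts(assertions):
    """Return the result collections that carry more than one distinct
    CollectionAfterInsertion assertion (hypothetical helper)."""
    seen = defaultdict(set)
    for result, source, key, value in assertions:
        # Identical repeats collapse in the set, so k1==k2 and v1==v2 is fine.
        seen[result].add((source, key, value))
    return {c: sorted(t) for c, t in seen.items() if len(t) > 1}


assertions = [
    ("c2", "c1", "k1", "v1"),
    ("c2", "c1", "k2", "v2"),  # second, different insertion into c2
    ("c3", "c2", "k3", "v3"),
    ("c3", "c2", "k3", "v3"),  # exact repeat, allowed
]
conflicts = find_conflicts(assertions)
print(conflicts)
```

Only c2 is reported, since c3 merely repeats the same assertion.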


If the asserter wants to describe replacement of the same key k1, such
as in a dictionary, an intermediate collection would be needed:

CollectionAfterInsertion(c1, c, k1, v1)
CollectionAfterRemoval(c2, c1, k1)
CollectionAfterInsertion(c3, c2, k1, v2)
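
The chain above can be replayed with a small Python sketch that treats
every collection as an immutable snapshot. The helper names
after_insertion and after_removal are invented for this example, the
initial collection c is assumed to be empty, and the rule that a removal
requires the key to be present is an assumption of the sketch, not
something stated in PROV-DM.

```python
def after_insertion(source, key, value):
    """Return a new collection: source plus (key, value).
    The key must not already be in source (no automatic replacement)."""
    if key in source:
        raise ValueError(f"key {key!r} already present; remove it first")
    result = dict(source)
    result[key] = value
    return result


def after_removal(source, key):
    """Return a new collection: source without key.
    Assumption of this sketch: the key must be present in source."""
    if key not in source:
        raise ValueError(f"key {key!r} not present")
    result = dict(source)
    del result[key]
    return result


c = {}                                # assumed empty starting collection
c1 = after_insertion(c, "k1", "v1")   # CollectionAfterInsertion(c1, c, k1, v1)
c2 = after_removal(c1, "k1")          # CollectionAfterRemoval(c2, c1, k1)
c3 = after_insertion(c2, "k1", "v2")  # CollectionAfterInsertion(c3, c2, k1, v2)
print(c1, c2, c3)
```

Calling after_insertion(c1, "k1", "v2") directly raises ValueError in
this sketch, which is the reason the intermediate collection c2 is
needed.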


If the asserter wants to describe multiple values for the same key,
the value should be a nested Collection which is built separately.
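
A minimal Python sketch of that nesting, assuming collections are
modelled as plain dicts. The helper after_insertion, the names n0, n1
and n2, and the inner keys 0 and 1 are all made up for this example.

```python
def after_insertion(source, key, value):
    """Return a new collection: source plus (key, value).
    The key must not already be in source."""
    if key in source:
        raise ValueError(f"key {key!r} already present")
    result = dict(source)
    result[key] = value
    return result


# Build the nested collection of values on its own, one insertion at a time.
n0 = {}
n1 = after_insertion(n0, 0, "v1")
n2 = after_insertion(n1, 1, "v2")

# Then insert the nested collection once, as the single value for k1.
c1 = {}
c2 = after_insertion(c1, "k1", n2)
print(c2)
```

The outer collection still sees exactly one insertion for k1, so the
functional constraint holds at both levels.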



If the asserter wants to assert that multiple key-value pairs were
added/removed, but doesn't want to assert their order, they are out of
luck. (Add CollectionAfterUnion/Intersection/Complement ?)

Similarly, if an asserter simply wants to say that a collection did
contain something, they are out of luck. (Add CollectionContained ?)

If an asserter wants to say that two collections are equal, they are
out of luck. (Add CollectionsEqual ?)


On Sun, Oct 30, 2011 at 00:46, Provenance Working Group Issue Tracker
<sysbot+tracker@w3.org> wrote:
>
> PROV-ISSUE-138 (collection-collision): Collection does not describe multiple additions/replacements [Data Model]
>
> http://www.w3.org/2011/prov/track/issues/138
>
> Raised by: Stian Soiland-Reyes
> On product: Data Model
>
>
> http://www.w3.org/TR/prov-dm/#expression-Collection introduces relations for expressing collection modifications:
>
>
> Expression: wasAddedTo_Key(c,k) (resp. wasRemovedFrom_Key(c,k)) denotes that collection c had a new value with key k added to (resp. removed from) it.
>
>
> It is not clear what would happen if a second addition added a value with the same key.
>
> Imagine:
>
>  wasAddedTo_Coll(c2,c1)
>  wasAddedTo_Key(c2,k1)
>  wasAddedTo_Entity(c2,e1)
>
>  wasAddedTo_Coll(c3,c2)
>  wasAddedTo_Key(c3,k1)
>  wasAddedTo_Entity(c3,e2)
>
> It is clear that c3 contains (k1, e2). Does it also contain (k1, e1)?
>
> I understand this is meant as a general collection, and the interpretation of this might as well depend on the specific type of collection. However that means that without knowing the type of collection you can't tell if e1 is contained in c3 or not.
>
>
> I believe we should not allow automatic replacement, and neither allow multiple values for the same key.  So if asserting:
>
>  wasAddedTo_Coll(c2,c1)
>  wasAddedTo_Key(c2,k1)
>  wasAddedTo_Entity(c2,e1)
>
>
> c1 does *not* contain k1, but *might* have e1 under a different key.
>
>
> Suggested:
> wasAddedTo_Coll(c2,c1); wasAddedTo_Key(c2,k1) asserts that c2 now contains the key k1 - but the collection c2 is derived from, c1, did not contain the key k1.
>
> A second addition of the same key without an intermediate removal is not valid in PROV. If the specific type of collection is performing replacement by key, such as a dictionary/map/hashtable, then the asserter will need to model an intermediate wasRemovedFrom_Coll(intermediate, original) and  wasAddedTo_Coll(new, intermediate). If the specific type of collection contains multiple values per key, then the value should instead be asserted as a new nested collection of those values.
>
>
>
>
>



-- 
Stian Soiland-Reyes, myGrid team
School of Computer Science
The University of Manchester

Received on Wednesday, 22 February 2012 16:19:23 UTC