Re: views, complements and invariants from Graham Klyne on 2011-09-01 (public-prov-wg@w3.org from September 2011)

From: Graham Klyne <Graham.Klyne@zoo.ox.ac.uk>
Date: Thu, 01 Sep 2011 12:43:40 +0100
To: "Myers, Jim" <MYERSJ4@rpi.edu>
CC: Khalid Belhajjame <Khalid.Belhajjame@cs.man.ac.uk>, Paul Groth <p.t.groth@vu.nl>, "public-prov-wg@w3.org" <public-prov-wg@w3.org>
Message-ID: <4E5F6FEC.20803@zoo.ox.ac.uk>
On 29/08/2011 20:42, Myers, Jim wrote:
>
>
>> -----Original Message-----
>> From: Graham Klyne [mailto:Graham.Klyne@zoo.ox.ac.uk]
>> Sent: Thursday, August 25, 2011 6:56 AM
>> To: Myers, Jim
>> Cc: Khalid Belhajjame; Paul Groth; public-prov-wg@w3.org
>> Subject: views, complements and invariants (was: updates to PAQ doc for
>> discussion)
>>
>> On 20/08/2011 20:26, Myers, Jim wrote:
>>> Graham, I'd like to have something like you describe, but I don't think you
>> can really make the inferences you want to for the general case, and don't
>> necessarily see why version helps versus view. I'm not opposed to having a
>> subtype of view for version, but I'm not sure how to make it rigorous...
>>>
>>> Taking those parts in order:
>>>
>>> 1) The problems I have with inferring authorship/editorship have to do with
>> the fact that not all edits are equal. Someone who just fixes grammar might
>> not be an author/editor in the final doc. Changes that are put in in one
>> version and removed in another may or may not indicate that an intellectual
>> contribution has been made (my text doesn't make the final version but
>> other text is added in some other part of the doc because of what I
>> contributed...am I an author/editor or not?). To me these types of issues
>> really indicate that a document is not just a more flexible version of a
>> file/file-like version, that edit operations aren't really occurring on the
>> same type of thing as editorial/intellectual contributions are made on, etc.
>> So we really have the IVPof/view case although we try to pretend that is
>> really hierarchical and just a matter of more/less constrained versions of
>> the same thing.
>>
>> I think this comment is mostly to do with the specifics (i.e. weakness) of my
>> example used to illustrate the desideratum.
>>
>> It would be easier to talk about this simply in terms of time-varying
>> resources.
>>    Let's try the weather report example again:
>>
>> (Weather in London at 12:00 on 1-Jan-2000)
>>      isViewOf (Weather in London on 1-Jan-2000)
>>
>> (Weather in London on 1-Jan-2000)
>>      isViewOf (Weather in London)
>>
>> I think it can be useful to infer from this that:
>>
>> (Weather in London at 12:00 on 1-Jan-2000)
>>      isViewOf (Weather in London)
>>
>>   From this, I would expect any provenance statements that are generally true
>> for (Weather in London) are also true for (Weather in London at 12:00 on
>> 1-Jan-2000).  That is, invariants are preserved forward across isViewOf
>> relations (and others may be introduced).
>
> So the historic temperature range of (Weather in London) is also applicable to and true for (Weather in London on 1-Jan-2000)? If El Nino influences the weather in London it did so on Jan 1 as well? (I hope these are relevant issues going in the right direction...)

I would say so.  In general, any assertion for resource R that is true 
throughout the interval [t0,t1] is also true for any sub-interval of [t0,t1].
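
The sub-interval rule can be sketched in a few lines of Python. This is an illustration only: the `Assertion` type, the `holds_for` helper and the numeric times are hypothetical names and values chosen for the example, not anything taken from the model document.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Assertion:
    """A statement about a resource, true throughout [t0, t1] (hypothetical type)."""
    resource: str
    statement: str
    t0: float
    t1: float


def holds_for(assertion: Assertion, s0: float, s1: float) -> bool:
    """True if [s0, s1] is a sub-interval of the assertion's own interval."""
    return assertion.t0 <= s0 <= s1 <= assertion.t1


# Illustrative values only: hours counted from the start of 1-Jan-2000.
all_day = Assertion("Weather in London", "influenced by El Nino", 0.0, 24.0)

noon_ok = holds_for(all_day, 12.0, 12.0)      # a single instant inside the day
next_day_ok = holds_for(all_day, 20.0, 30.0)  # runs past the asserted interval
```

The inference only runs from the wider interval to the narrower one: something known to be true at 12:00 tells us nothing about the rest of the day.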

> I don't think the problem is with a specific example - it's more that we've got some hidden assumptions in the way we talk about/think about these examples and those assumptions aren't always true/consistent. I can see the weather at one point in time as just a snapshot but then I use the same (Weather in London) identifier to talk about something that has temporal patterns or averages and ranges which are inconsistent with that view. And what we'd like to infer (say that the temperature on Jan 1 had to be within the temperature range of the overall weather) crosses those views in ways that probably require making the IVPof-style transitions explicit (and may also require adding some domain knowledge).

I see the time-range constraint as one kind of explicit IVPof-style 
contextualization constraint (which I don't see as depending on domain 
knowledge).  There may be other constraints that do (e.g. Halley's Comet within 
some determined distance of the sun, to echo an earlier example).


>
>>
>>> 2) Regarding the question of why version does something better that
>>> view - if X is a view of A and Y is another view of A, why wouldn't I
>>> think inferring creatorship/editorship is OK? (I'm claiming above that
>>> inferring is probably not valid in some cases - here I'm asking
>>> whether version does a better job of cutting down on those cases
>>> versus view.) I.e. if you wrote the bits to a section of a disk that
>>> is an anIVPof/view of a file, why wouldn't it be just as valid or
>>> invalid as trying to make that inference between a doc and a version
>>> of it? How does a hierarchical meaning help? (I guess I'm assuming
>>> that IVPof is one-way like version and for my disk versus file use
>>> case here I would actually assert IVPof in both directions so I could
>>> infer the file creator also wrote the bits to the disk and vice versa
>>> whereas with your doc/version case the IVPof relationship would go one
>>> way. So, rephrasing the question here - I'd agree that inference
>>> should only go in the direction of the relationship, but if there are
>>> relationships in both directions, wouldn't inferencing be just as valid
>>> for that case?)
>>
>> Maybe I'm misunderstanding the intent, but what I interpret from
>>
>>     if X is a view of A and Y is another view of A
>>
>> is that (X complementOf Y) and (Y complementOf X) in the previous
>> terminology.
>>   From this, I can't see how my knowledge of X alone allows me to infer
>> anything about Y.  (I already accept the relevance of this kind of relation for
>> dealing with accounts.)
>
> I don't think this is how complementOf is defined - the model doc has
> isComplementOf(rs_l1, rs)
> isComplementOf(rs_l2, rs)
>
> but does not have
> isComplementOf(rs_l1, rs_l2)
>
> I dislike the name because it suggests the interpretation you give, but as far as I can tell, the usage is not different.

Hmmm... IIRC, the definition is in terms of attributes having common values.

So, yes, you are right:  my inference applies *only* when A has invariant 
attributes that are true of every IVPof A.

As I read it, complementOf is *not* the same as was originally meant by IVPof. 
I think it was Khalid who pointed out that complementOf serves the needs of 
talking about different accounts of provenance, in which the transitive nature 
of IVPof no longer applies.

>>> 3) I would tie the use cases together and rather than looking to infer
>>> authorship/editorship from view or version relationships, I would see
>>> any differences in who's listed for the doc and the aggregate list
>>> from each version as an indication that there's been an error, a lie,
>>> or the provenance is just not complete (intellectual contributions
>>> haven't been separated from text/file-level edits, one version isn't
>>> really 'derivedfrom' another when I look at more granularity in the
>>> files or processes, etc.)
>>>
>>> A version relationship may still be useful, particularly if we agree that it
>> allows inferencing as you want (i.e. you only use version instead of view
>> when you want people to infer authorship/editorship/(what else can I
>> infer?) -view shouldn't work that way, version could though there would be
>> cases where the English language meaning and this technical definition would
>> be at odds (the examples I've given).
>>>
>>> If we do that, I think version would have to only be valid within an
>>> account - i.e. the notion of version is an indication that, for the
>>> set of processes being reported, the asserter believes one can
>>> consider the view relationships hierarchical/transitive/version-like
>>> and inferencing is OK. If I take two accounts that use version and
>>> merge them, I may find that the set of processes they describe will
>>> break versioning - versions might have to be interpreted as views
>>> because of the additional info (Perhaps this example works: if you use
>>> version to indicate text changes in a doc and I use version to
>>> describe multiple copies (file versions) of one logical file (one of
>>> your versions of a doc), I think both might be internally consistent,
>>> but together they'd imply that every person who copied a version of
>>> your doc was an author/editor which is not what you intended). Perhaps
>>> version being account-limited is still OK - PIL is an assertion
>>> language and so an asserter may be wrong and it may be possible that they
>>> are wrong about a version relationship while still being right about
>>> there being a view relationship...
>>
>> Hmmm... I'm not sure I go with this.  You seem to be saying that the truth of
>> provenance assertions about a resource (as opposed to secondary
>> properties concerning the available inferences) is contextualized by the
>> account in which they occur.
>
> But the only difference between IVPof and versionOf by my definitions above is that versionOf allows inferences, so we're saying the same thing - secondary properties concerning inferences would have to be contextualized by the account to capture what you mean by version (or perhaps you're arguing that one shouldn't use subclassing of relationships to define inference characteristics? Given my initial position that we should just have IVPof (which I think is the same as complementOf as discussed above), I'm OK with an argument that my attempt to capture the additional semantics you want for version is not a good way to do it. If it isn't, is there a better way? Or does this argue for leaving versionOf as something one can infer across out of the model completely?).

I /think/ I agree.  If I understand correctly:  the versionOf relation would be 
true independently of any account, but any inferences (e.g. derivedFrom) that 
are account-dependent could only be applicable and inferrable within an account. 
  So, in the attached example diagram, the account-independent "author" 
attribute for C can be inferred for D, but the derivedFrom relationship cannot.
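
A minimal sketch of this reading follows. The diagram is not reproduced here, so the sketch assumes that D is a version of C and that C is derived from some B in one account; the account name, the author value and both helper functions are hypothetical, for illustration only.

```python
from typing import Dict, Optional, Set, Tuple

# Account-independent facts (all names and values are illustrative).
version_of: Dict[str, str] = {"D": "C"}  # D versionOf C
attributes: Dict[str, Dict[str, str]] = {"C": {"author": "Alice"}}

# Account-dependent relations, keyed by a hypothetical account name.
derived_from: Dict[str, Set[Tuple[str, str]]] = {
    "account1": {("C", "B")},  # C derivedFrom B, asserted in account1 only
}


def inferred_attribute(resource: str, name: str) -> Optional[str]:
    """Follow versionOf links until an invariant attribute value is found."""
    seen: Set[str] = set()
    current: Optional[str] = resource
    while current is not None and current not in seen:
        seen.add(current)  # guard against accidental versionOf cycles
        value = attributes.get(current, {}).get(name)
        if value is not None:
            return value
        current = version_of.get(current)
    return None


def is_derived_from(account: str, target: str, source: str) -> bool:
    """derivedFrom is looked up inside one account and never carried across versionOf."""
    return (target, source) in derived_from.get(account, set())
```

With this data, `inferred_attribute("D", "author")` yields the author recorded for C, while `is_derived_from("account1", "D", "B")` is false: the account-dependent relation stays with C inside its own account.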

#g
--

>>
>> Examining the account example in the OPM model document
>> (http://eprints.ecs.soton.ac.uk/21449/1/opm.pdf, p6), I see two accounts
>> (looking at figure 4):
>>
>>     (3,7) is derived from (2,6) by application of "add1"
>>
>>     3 is derived from 2 by application of "add1"
>>     7 is derived from 6 by application of "add1"
>>
>> These statements come from two different accounts, but I see their truth is
>> independent of the account to which they belong (or of which they are a
>> part).
>> What *does* depend on the account context being considered are
>> assertions that can be made about the overall structure of the provenance
>> graph (e.g. the absence of loops).
>>
>> I find the idea that a provenance assertion may be valid (i.e. True) only for a
>> given account to be surprising.  And if it's part of the resulting provenance
>> model, I think developers will get it wrong.
>>
>> #g
>> --
>
Attachments

image/png attachment: versions-and-accounts.png
Received on Thursday, 1 September 2011 11:48:12 UTC