RE: [Spam:***** SpamScore] Re: formal semantics strawman from Myers, Jim on 2011-08-27 (public-prov-wg@w3.org from August 2011)

From: Myers, Jim <MYERSJ4@rpi.edu>
Date: Sat, 27 Aug 2011 16:43:58 +0000
To: James Cheney <jcheney@inf.ed.ac.uk>, Graham Klyne <GK@ninebynine.org>
CC: W3C provenance WG <public-prov-wg@w3.org>
Message-ID: <3131E7DF4CD2D94287870F5A931EFC23010FDE@EX14MB2.win.rpi.edu>

Trying to catch up after travel. I'm not sure I have the big picture, but here are a few comments about statements in the thread:

We need to avoid saying that entities are fixed and ?things are not. Entities are fixed in some ways (they are defined by attributes with values) but mutable in others. ?things can be this way as well. We may assert entities that have the same ID as an existing ?thing, because that ?thing is already fixed in the ways we need for provenance, or we may assert entities that are 'complements' of ?things when we need to fix more attributes, or different attributes, than the ?thing itself does.

The interval over which an entity exists may differ from the one over which a complementOf relationship is true. If we want an asserted entity that is the fixed content at a live URL, it should have its own ID and a complementOf relationship with the site URL (a ?thing, and potentially another entity if we wish to discuss the provenance of the live site itself). The complementOf relationship is true until the site content updates, but the fixed page entity could still exist. I think this is consistent with the discussion but am not sure.

The interpretation of an entity should be time invariant. One can choose to assert an entity that is a fixed web page or one for a live website, but one should not assert an entity with a content property as part of its definition and then have that property change. For the example.com example, one can define an entity that is 'the content available from the example2.com site no matter what URL you get it from' (retrieval URL can't be an attribute here) or one that is 'the content retrievable from example2.com', which is generated when the site starts operating and ceases to exist when the site URL is blocked from the world or retired (retrieval URL can be an attribute here). Either is valid. The point with an entity is that you pick one definition and stick with it, using complementOf when you need to switch definitions.
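
As a rough illustration of the points above, here is a minimal Python sketch, not anything taken from the model document. It assumes abstract numeric time points, and every name in it (Interval, Entity, ComplementOf, the IDs and attribute keys) is hypothetical. It shows an entity whose defining attributes are read-only, and a complementOf relationship whose validity interval is tracked separately from the interval over which either entity exists.

```python
from dataclasses import dataclass
from types import MappingProxyType
from typing import Mapping, Optional


@dataclass(frozen=True)
class Interval:
    """Half-open interval [start, end); end=None means still open."""
    start: float
    end: Optional[float] = None

    def contains(self, t: float) -> bool:
        return self.start <= t and (self.end is None or t < self.end)


@dataclass(frozen=True)
class Entity:
    """Hypothetical entity: an ID plus the attributes that are fixed for it."""
    id: str
    fixed: Mapping[str, str]  # defining attributes; never change afterwards
    exists: Interval          # interval over which the entity exists


@dataclass(frozen=True)
class ComplementOf:
    """Hypothetical complementOf assertion with its own validity interval."""
    a: str
    b: str
    holds: Interval


# The live site: retrieval URL is part of its definition, content is not.
live_site = Entity(
    id="http://example2.com/",
    fixed=MappingProxyType({"retrieval_url": "http://example2.com/"}),
    exists=Interval(0),
)

# The fixed page: content is part of its definition, so it gets its own ID.
snapshot = Entity(
    id="urn:example:snapshot-1",
    fixed=MappingProxyType({"content": "<html>v1</html>"}),
    exists=Interval(10),
)

# complementOf holds only until the site content updates (time 20 here).
link = ComplementOf(snapshot.id, live_site.id, holds=Interval(10, 20))

still_linked = link.holds.contains(25)          # False: site has updated
snapshot_exists = snapshot.exists.contains(25)  # True: page entity lives on
```

In this sketch the snapshot entity outlives the complementOf relationship, which matches the case described above where the site content updates but the fixed page entity still exists.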

Hope those are helpful in the larger discussion (and consistent with others' interpretations!)...

Jim
________________________________________
From: public-prov-wg-request@w3.org [public-prov-wg-request@w3.org] on behalf of James Cheney [jcheney@inf.ed.ac.uk]
Sent: Friday, August 26, 2011 7:17 AM
To: Graham Klyne
Cc: W3C provenance WG
Subject: [Spam:***** SpamScore] Re: formal semantics strawman

On Aug 25, 2011, at 6:59 PM, Graham Klyne wrote:

> James,
>
> Thanks.  This helps to clarify for me some things that weren't clear to me in the model document.
>

Note that the strawman is not necessarily capturing the intent of the model document (it just represents my initial effort to interpret it formally) so might be misleading about what it was trying to say.  For example, Luc asked me offline to change what I was calling "entity" to something else because it doesn't match the model.

I've now updated the document to avoid this potential confusion between the PIDM assertion "entity" and the semantic "?things".  (The question mark is there to flag it as a term with special meaning; we should probably find a less generic term.)

> You say at http://www.w3.org/2011/prov/wiki/FormalSemanticsStrawman#Interpreting_an_entity_assertion:
> [[
> Note that there is a design choice here: do we require that the entity associated with id be the same throughout the interval or not? I have chosen to require this, since otherwise the entity assertion doesn't seem to be about a "single entity across a time interval". Of course, if we require that the mapping from URIs to entities be time-invariant then this problem goes away.
> ]]
>
> As far as I can tell from a quick skim, everything else works as intended (at least in sections 1.3, 1.4) if the URI->Entity mapping is invariant.  Which I think leads to a model in which the distinction between resource and entity (which I find to be unhelpful) becomes less significant.
>

I was thinking of a situation where a URI is "retired" and redirected to a different target, e.g. example.com merges with example2.com, and http://www.example2.com is redirected to example.com's website from then on.  Perhaps in this example http://www.example2.com is by definition not a URI.

I think assuming that lookup is time-insensitive would be reasonable (and would definitely simplify some of the definitions), but wanted to highlight the design choice since it seems related to things that have been debated on the list.  I'd rather keep things more general now until it's clear that there's consensus about a consistent picture with the other related components.  If it is then obvious that time-variance in the interpretation of URIs is superfluous then it'll be easy to eliminate it.
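
To make the design choice concrete, here is a small Python sketch under stated assumptions. It is not the definition used on the wiki: time is modelled as integers, the merger is assumed to happen at time 5, and the names lookup, denotes_single_thing and the thing labels are all hypothetical. It shows a time-variant interpretation of URIs and the check that a URI denotes one and the same ?thing throughout an interval.

```python
from typing import Callable, Iterable, Optional

# A time-variant interpretation maps (URI, time) to a thing label, or None.
Interpretation = Callable[[str, int], Optional[str]]


def lookup(uri: str, t: int) -> Optional[str]:
    """Hypothetical interpretation for the example.com / example2.com merger."""
    if uri == "http://www.example2.com":
        # Before the assumed merger time the URI denotes example2's own
        # site; from then on it is redirected to example.com's site.
        return "example2-site" if t < 5 else "example-site"
    if uri == "http://www.example.com":
        return "example-site"
    return None  # URI denotes nothing in this toy interpretation


def denotes_single_thing(
    interp: Interpretation, uri: str, times: Iterable[int]
) -> bool:
    """True if the URI denotes the same defined thing at every given time."""
    things = {interp(uri, t) for t in times}
    return len(things) == 1 and None not in things


before_merger = denotes_single_thing(lookup, "http://www.example2.com", range(0, 5))
across_merger = denotes_single_thing(lookup, "http://www.example2.com", range(3, 8))
stable_uri = denotes_single_thing(lookup, "http://www.example.com", range(0, 10))
```

If the interpretation were required to ignore its time argument, the check would succeed for every URI that denotes anything at all, which is the simplification that a time-insensitive lookup would buy.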

--James


> #g
> --
>
> On 25/08/2011 18:13, James Cheney wrote:
>> Hi,
>>
>> I've been promising for a while now to write down a short formal semantics strawman to illustrate what I have in mind.  I've put something onto the wiki here:
>>
>> http://www.w3.org/2011/prov/wiki/FormalSemanticsStrawman
>>
>> It's definitely not a finished product but I've made an effort to cover entity assertions, ivp/complement, process execution, and events (but NOT derivation :)
>>
>> One thing that's become apparent already is that there is a large potential for confusion, since we are talking about assertions about things that may change over time.  The assertions may explicitly mention time points/intervals, and they may also implicitly have an "assertion time" or "time intended to be valid" associated with them. Some of the assertions in the Conceptual Model document also have explicit times associated with them (e.g. use, generation and process execution assertions).  Others, such as entity assertions, do not have explicit time arguments, but the discussion surrounding them refers to time points or intervals during which the entity being described exists.
>>
>> So for each kind of assertion p(x,y,z,...), it would be helpful to clarify whether:
>> 1.  p(x,y,z,...) is something that either always holds or never holds; or
>> 2.  p(x,y,z,...) can hold or not at a specific point in time t (there may be a convention that we can make this explicit by adding an argument, e.g. p(x,y,z,...,t)); or
>> 3.  p(x,y,z,...) can hold or not during an interval [t1,t2] (again there may be a convention where we add 2 arguments).
>>
>> Currently, there seems to be a mix of conventions.
>>
>> Comments are welcome.  I'm not pretending to have read all the relevant background / mailing list discussion carefully and so I may be using terminology incorrectly.  As the name suggests, I expect this to be easy to knock down, but hope that we'll learn something in doing so anyway.
>>
>> --James
>
>


Received on Saturday, 27 August 2011 16:57:25 UTC