
Re: Issue 89 - why?

From: Graham Klyne <GK@ninebynine.org>
Date: Sun, 18 Sep 2011 11:20:23 +0100
Message-ID: <4E75C5E7.1060302@ninebynine.org>
To: "Myers, Jim" <MYERSJ4@rpi.edu>
CC: W3C provenance WG <public-prov-wg@w3.org>

On 17/09/2011 16:15, Myers, Jim wrote:
> Are you asking whether we need to distinguish between something and 'something that can't change in some ways' to unambiguously record provenance, or just whether frozen attributes is the best way to do that?
> If we don't distinguish at all, we have a mess - a document and a version can't be distinguished if we can't talk about fixed content and we'd then be unable to answer questions about when the document was created (with the first version or only when the text was finalized).

Agreed, we need to be able to distinguish between the document and its 
"versions" for which some values about which we make provenance assertions are 

> (This is the problem with things - we don't always agree on what aspects of a thing can change and still be recognizable as the same thing, so we define entities for which the aspects that are important relative to the provenance we're recording are clearly changeable or not changeable, not open to interpretation).
> If we consider the alternatives to fixing attributes, the most obvious would be to stick the constraint in the type/class - as we do with document and document-version. Either works, but you end up with a lot of type proliferation. 'document-version<#>-at-location<>-inEncoding<>-withEncryption<>' is well defined relative to moving, encoding and encryption changes, etc. The alternative encoding is to fix the attributes. To me, the interpretation should be the same in both cases - a version is really a different kind of thing than a document even if we record it as document with a  fixed content attribute. (The statue and other examples make this clearer).

I take a view that something may be a "version" of something else if it is 
asserted to be (*).  The important consequence of being such a "version" is that 
valid provenance assertions made with respect to these versions are permanent 
truths, and they can be said to be about some aspect of the original 
resource.  Beyond that, why do we need to know what are the particular 
constraints for a particular "version"?

I guess I'm trying to dodge the philosophical minefields about what constitutes 
identity.  I'm more concerned with what we need as a minimum to be able to 
record, exchange and do useful things with provenance information.

It could be that I'm missing something important here, hence my original 
question being phrased as "what breaks?"


You also raise what I see as a separate issue:  "a version is really a different 
kind of thing than a document".  In some senses, this is almost tautologically 
true, but from a perspective of ontologizing, I'm not sure it's useful.  Can 
versions have versions?  (I think so.)  Then we are faced with a potentially 
infinite regress of types, or a type that can be reflexive (if that's an 
allowable use) with respect to the version relationship; i.e. a type that can be 
both range and domain of a "has version".  To me, the latter seems to be the 
simpler course, unless and until we find some essential functionality that is 
broken in such an approach.
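
To make the reflexive option concrete, here is a minimal sketch in Python.  It
is only an illustration under stated assumptions: the class name `Entity`, its
fields and the example names are hypothetical and are not taken from the
provenance model or from OPMV.  A single type serves as both domain and range
of the "has version" relationship, so versions of versions need no new types.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Entity:
    """Hypothetical single type: both domain and range of 'has version'."""
    name: str
    # Parent link: the entity this one is asserted to be a version of.
    version_of: Optional["Entity"] = field(default=None, repr=False, compare=False)
    # Child links: entities asserted to be versions of this one.
    versions: List["Entity"] = field(default_factory=list, repr=False, compare=False)

    def add_version(self, name: str) -> "Entity":
        """Assert a new version of this entity; the result has the same type."""
        child = Entity(name=name, version_of=self)
        self.versions.append(child)
        return child

    def root(self) -> "Entity":
        """Follow 'version of' links back to the original resource."""
        node = self
        while node.version_of is not None:
            node = node.version_of
        return node


# A version of a version, with no extra types required (names are made up).
doc = Entity("report")
draft = doc.add_version("report-draft-1")
snapshot = draft.add_version("report-draft-1-as-emailed")
```

Nothing here records which attributes are frozen in each version; the sketch
only records the asserted relationship, which is the minimum I am arguing for.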


(*) of course, it may be of interest to others to understand what makes 
something a "version" of something else, and to understand the variant and 
invariant elements in detail.  I'm just asking if this needs to be part of the 
_provenance_ discussion, or if it can be treated separately.

For example, OPMV avoids this whole issue by saying that the things to which 
provenance is applied are static [1].  This is enough for OPMV to be useful in 
a significant range of applications for provenance (I understand it is used in 
the current UK open gov data work).  I personally think that might be too strong 
a constraint, but if the price of relaxing that constraint is to wade into 
difficult philosophical territory, then I'm not so sure it's worth it.

The fact that the things OPMV describes may be different versions of some 
underlying thing is simply not part of this particular ontology, and it seems to 
work OK so far.

[1] http://open-biomed.sourceforge.net/opmv/ns.html#sec-specification - see 
sub-section on "Artifact"


>> -----Original Message-----
>> From: public-prov-wg-request@w3.org [mailto:public-prov-wg-
>> request@w3.org] On Behalf Of Graham Klyne
>> Sent: Saturday, September 17, 2011 3:07 AM
>> To: W3C provenance WG
>> Subject: Issue 89 - why?
>> I've been reading some of the discussion of Issue 89:
>>     http://www.w3.org/2011/prov/track/issues/89
>> which seems to my mind to be getting rather like a counting of angels-on-
>> pinheads, and I wonder if we're not in danger of over-ontologizing here.
>> Going back to the original issue, I see:
>> [[
>> The conceptual model defines an entity in terms of an identifier and a list of
>> attribute-value pairs. It is indeed crucial for the asserter to identify the
>> attributes that have been frozen in a given entity.
>> ]]
>> Why is it so crucial to identify what attributes have been frozen?
>> What practical application of provenance is prevented if we don't require
>> this?
>> #g
>> --
Received on Sunday, 18 September 2011 10:28:24 UTC
