Re: Issue 89 - why?

From: Graham Klyne <GK@ninebynine.org>
Date: Sun, 18 Sep 2011 22:16:43 +0100
Message-ID: <4E765FBB.9060002@ninebynine.org>
To: Satya Sahoo <satya.sahoo@case.edu>
CC: "Myers, Jim" <MYERSJ4@rpi.edu>, W3C provenance WG <public-prov-wg@w3.org>
On 18/09/2011 19:52, Satya Sahoo wrote:
> Hi Jim and Graham,
>
>> If we don't distinguish at all, we have a mess - a document and a version
>> can't be distinguished if we can't talk about fixed content and we'd then
>> be unable to answer questions about when the document was created (with the
>> first version or only when the text was finalized).
>
>
> I believe modeling a document d1 versus modeling versions of document d1v1,
> d2v2 are two distinct notions.

d1, d1v1, d1v2 are surely different things, but I don't see that modelling them
is fundamentally different.  What sorts of things can one say about d1v1/d1v2 
that one cannot say about d1?  And vice versa?

> The d1v1 and d2v2 are specialized (maybe
> subclass) notions of d1. Also, modeling concepts such as d1v1, d2v2 are not
> required by all provenance applications.
>
>
>> For example, OPMV avoids this whole issue by saying that the things to
>> which provenance are applied are static [1].
> The OPMV has used the original OPM Artifact definition and hence the OPM
> notion of "static" Artifact.

Certainly - my point was that it doesn't prevent one from describing (say) d1v1, 
d1v2, etc. and also separately saying that they are "versions" of d1.  And it's
a *lot* simpler than the current proposal.

#g
--

> On Sun, Sep 18, 2011 at 6:20 AM, Graham Klyne <GK@ninebynine.org> wrote:
>
>> Jim,
>>
>>
>> On 17/09/2011 16:15, Myers, Jim wrote:
>>
>>> Are you asking whether we need to distinguish between something and
>>> 'something that can't change in some ways' to unambiguously record
>>> provenance, or just whether frozen attributes is the best way to do that?
>>>
>>> If we don't distinguish at all, we have a mess - a document and a version
>>> can't be distinguished if we can't talk about fixed content and we'd then be
>>> unable to answer questions about when the document was created (with the
>>> first version or only when the text was finalized).
>>>
>>
>> Agreed, we need to be able to distinguish between the document and its
>> "versions" for which some values about which we make provenance assertions
>> are invariant.
>>
>>
>>> (This is the problem with things - we don't always agree on what aspects
>>> of a thing can change and still be recognizable as the same thing, so we
>>> define entities for which the aspects that are important relative to the
>>> provenance we're recording are clearly changeable or not changeable, not
>>> open to interpretation).
>>>
>>> If we consider the alternatives to fixing attributes, the most obvious
>>> would be to stick the constraint in the type/class - as we do with document
>>> and document-version. Either works, but you end up with a lot of type
>>> proliferation. 'document-version<#>-at-location<>-inEncoding<>-withEncryption<>'
>>> is well defined relative to moving, encoding and encryption changes, etc.
>>> The alternative encoding is to fix the attributes. To me, the interpretation
>>> should be the same in both cases - a version is really a different kind of
>>> thing than a document even if we record it as a document with a fixed content
>>> attribute. (The statue and other examples make this clearer).
>>>
>>
>> I take a view that something may be a "version" of something else if it is
>> asserted to be (*).  The important consequence of being such a "version" is
>> that valid provenance assertions made with respect to these versions are
>> permanent truths, and they can be said to be about some aspect of the
>> original resource.  Beyond that, why do we need to know what are the
>> particular constraints for a particular "version"?
>>
>> I guess I'm trying to dodge the philosophical minefields about what
>> constitutes identity.  I'm more concerned with what we need as a minimum to
>> be able to record, exchange and do useful things with provenance
>> information.
>>
>> It could be that I'm missing something important here, hence my original
>> question being phrased as "what breaks?"
>>
>> ...
>>
>> You also raise what I see as a separate issue:  "a version is really a
>> different kind of thing than a document".  In some senses, this is almost
>> tautologically true, but from a perspective of ontologizing, I'm not sure
>> it's useful.  Can versions have versions?  (I think so.)  Then we are faced
>> with a potentially infinite regress of types, or a type that can be
>> reflexive (if that's an allowable use) with respect to the version
>> relationship; i.e. a type that can be both range and domain of a "has
>> version".  To me, the latter seems to be the simpler course, unless and
>> until we find some essential functionality that is broken in such an
>> approach.
>>
>> ...
>>
>> (*) of course, it may be of interest to others to understand what makes
>> something a "version" of something else, and to understand the variant and
>> invariant elements in detail.  I'm just asking if this needs to be part of
>> the _provenance_ discussion, or if it can be treated separately.
>>
>> For example, OPMV avoids this whole issue by saying that the things to
>> which provenance are applied are static [1].  This is enough for OPMV to be
>> useful in a significant range of applications for provenance (I understand
>> it is used in the current UK open gov data work).  I personally think that
>> might be too strong a constraint, but if the price of relaxing that
>> constraint is to wade into difficult philosophical territory, then I'm not
>> so sure it's worth it.
>>
>> The fact that the things OPMV describes may be different versions of some
>> underlying thing is simply not part of this particular ontology, and it
>> seems to work OK so far.
>>
>> [1] http://open-biomed.sourceforge.net/opmv/ns.html#sec-specification - see sub-section on "Artifact"
>>
>> #g
>> --
>>
>>
>>
>>>> -----Original Message-----
>>>> From: public-prov-wg-request@w3.org [mailto:public-prov-wg-
>>>> request@w3.org] On Behalf Of Graham Klyne
>>>> Sent: Saturday, September 17, 2011 3:07 AM
>>>> To: W3C provenance WG
>>>> Subject: Issue 89 - why?
>>>>
>>>> I've been reading some of the discussion of Issue 89:
>>>>
>>>>     http://www.w3.org/2011/prov/track/issues/89
>>>>
>>>> which seems to my mind to be getting rather like a counting of angels-on-
>>>> pinheads, and I wonder if we're not in danger of over-ontologizing here.
>>>>
>>>> Going back to the original issue, I see:
>>>>
>>>> [[
>>>> The conceptual model defines an entity in terms of an identifier and a
>>>> list of
>>>> attribute-value pairs. It is indeed crucial for the asserter to identify
>>>> the
>>>> attributes that have been frozen in a given entity.
>>>> ]]
>>>>
>>>> Why is it so crucial to identify what attributes have been frozen?
>>>>
>>>> What practical application of provenance is prevented if we don't require
>>>> this?
>>>>
>>>> #g
>>>> --
>>>>
>>>
>>>
>>>
>>>
>>
>
Received on Sunday, 18 September 2011 21:17:41 GMT
