Re: Some thoughts about the revised provenance Model document from Graham Klyne on 2011-09-30 (public-prov-wg@w3.org from September 2011)

From: Graham Klyne <GK@ninebynine.org>
Date: Fri, 30 Sep 2011 10:09:14 +0100
To: Paul Groth <p.t.groth@vu.nl>
CC: 'W3C provenance WG' <public-prov-wg@w3.org>
Message-ID: <4E85873A.3010906@ninebynine.org>
On 30/09/2011 08:33, Paul Groth wrote:
> Hi Graham,
>
> I think the purpose of characteristic attributes is to define what you're
> describing the provenance of.
>
> So when I hand you provenance information for example for a web page, you'll
> know that I'm describing the provenance of x attributes of the web page and not
> everything there.

Hmmm... that's not quite how I was seeing this.

If we talk about "the web page modified by Arthur", what does it mean to 
describe the provenance of the attribute "modified by Arthur"?  I think that 
we're talking about the provenance of the particular view/perspective of the web 
page that happens to be modified by Arthur, which may be one of a family of 
pages modified by different editors.

...

I think there are two orthogonal issues here:

(1) given some entity, to make provenance assertions about that entity.  To be 
meaningful, the entity needs to be static (or constrained) to the extent that 
the assertions continue to be truthful in a variety of contexts.  So the entity 
might be "the weather report for Monday" which was modified by Arthur and will 
always remain so, even though Tuesday's version of that weather report may have 
been modified by Brenda.

(2) the other issue is:  given some dynamic entity, how to identify a "view" or 
"perspective" on that entity that has some static characteristics to allow 
meaningful expression of provenance information?  To my way of thinking, 
characterizing attributes are a perfectly reasonable way to do this, but not the 
only way.  Another possibility might be that the entity concerned has a unique 
name, and we can just use that (e.g. the published working versions of W3C 
specifications, URIs of document versions in a version management system).

So while I acknowledge that the attributes might be useful for some applications 
of provenance, I'm not convinced they are needed for all applications.  I also 
acknowledge that the model specification allows the attributes to be left 
unspecified.  But I am slightly concerned that there may be situations for 
identifying an entity that don't fit the attribute model, which would formally 
restrict our ability to express provenance information.

...

In the big scheme of things, I think the attributes approach is quite workable 
for any practical example I can think of.  My main objection is that I think 
it's unnecessary for it to be baked in to the whole provenance model, and as 
such runs the risk of making the whole framework more complex than it needs to be.

...

Another way to look at this might be to consider that provenance assertions for 
a given Entity MUST continue to be true for that entity, or they are not meaningful 
as provenance assertions.  To the extent that provenance assertions actually 
*are* static attributes of that entity, then the existence of static attributes 
(in the style of "characterizing attributes") may be inferred.  In this respect, 
the static attributes are a consequence rather than a defining aspect of the 
existence of meaningful provenance information.
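
As a rough illustration only, the IVPof definition from my earlier message 
(quoted near the end of this mail) can be sketched in Python.  The definition 
quantifies over all predicates on Entities; the sketch assumes a finite, 
explicitly listed set of predicates instead.  The dictionary representation of 
entities and the names ivp_of, report and monday_report are hypothetical and 
not part of the provenance model.

```python
from typing import Callable, Dict, Iterable

# Assumption: an entity is modelled as a plain dictionary of attribute values.
Entity = Dict[str, str]
Predicate = Callable[[Entity], bool]


def ivp_of(a: Entity, b: Entity, predicates: Iterable[Predicate]) -> bool:
    """Return True if every listed predicate that holds for b also holds for a.

    This approximates "A IVPof B iff forall p . p(B) => p(A)" over a finite
    set of predicates rather than over all possible predicates.
    """
    return all(p(a) for p in predicates if p(b))


# Hypothetical entities: a weather report in general, and Monday's version.
report = {"title": "weather report"}
monday_report = {
    "title": "weather report",
    "day": "Monday",
    "modified_by": "Arthur",
}

predicates = [
    lambda e: e.get("title") == "weather report",
    lambda e: e.get("day") == "Monday",
    lambda e: e.get("modified_by") == "Arthur",
]

print(ivp_of(monday_report, report, predicates))  # True
print(ivp_of(report, monday_report, predicates))  # False
```

Under these assumptions, whatever holds for the general report also holds for 
Monday's version, but not the other way round.  The attributes only show up 
through the predicates that happen to hold; nothing in the check requires them 
to define the entities.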

#g
--

> Graham Klyne wrote:
>> Jim,
>>
>> If I understand you correctly, the significance of attributes is for discovery
>> of related resources.
>>
>> My understanding is that the primary purpose of provenance is to establish a
>> basis for trust, a reason to believe (or not) some information that is presented
>> about some subject. It's not clear to me what need there is to use attributes
>> for resource discovery to achieve this end. (But I may well be missing
>> something here.)
>>
>> So, while there may be perfectly good reasons to have defined attributes
>> and values for discovery purposes, I'm not seeing why they are needed
>> to achieve the goals of *provenance* information.
>>
>> (But it's getting late here, and maybe I'm missing some key point in your
>> message.)
>>
>> In summary: I think your concerns are reasonable, but what makes them in scope
>> specifically for *provenance* information?
>>
>> #g
>> --
>>
>> On 29/09/2011 18:44, Myers, Jim wrote:
>>> Graham,
>>>
>>> How would we use provenance to find, for example, how Luc got to Boston? It's
>>> clear if we have fixed attributes for name and location such that we could
>>> query for an entity with name Luc that has an ivpOf relationship with an
>>> entity in Boston and then look at the provenance from there. How would it
>>> work without fixed attributes in the prov model? I'm guessing that you're
>>> thinking that we can find those attributes outside the language somewhere
>>> (e.g. non-prov RDF statements) but what are the minimal requirements there
>>> and what language/models exist that meet them? Can we only model provenance
>>> of things for which ontologies have been developed? Presumably it has to be
>>> possible to associate descriptive metadata with the entities through some
>>> path (what relationship(s)?)? And it has to be clear which metadata is fixed?
>>> You mention being able to infer across ivpOf relationships - is there one set
>>> of inference rules for all possible descriptive metadata? Or do we need to be
>>> able to distinguish further between types of metadata?
>>> --> As you can probably guess from the questions above, I'm concerned that
>>> kicking fixed attributes out will end up being more complex and place a
>>> higher burden on users than keeping them in, but I may be misunderstanding
>>> how such an alternative would work. Part of that concern is that I think I
>>> hear that modeling experts in this group can handle defining classes for
>>> different types of entities that would allow discovery by attribute, but I'm
>>> concerned that being able to do this becomes a requirement for using
>>> provenance (versus asserting entities defined solely by attributes (entity,
>>> name=Luc), or perhaps in a mixed mode, e.g. an entity representing Luc that
>>> 'hasBaseType' foaf:person and one representing him in Boston that also
>>> hasBaseType foaf:person and location=Boston as a fixed attribute). Again -
>>> perhaps I'm misunderstanding how discovery based on descriptive information
>>> could be done if we don't have fixed characterizing attributes in the prov
>>> standard....
>>>
>>> Jim
>>>
>>>> 3. Do we need to model "Characterizing attributes"?
>>>>
>>>> The notions of "characterizing attributes" have developed to derive the
>>>> relationship between different entities that are views of some common
>>>> thing in the world. I am not convinced that we need to model these
>>>> attributes, and I'm not sure the way they are modelled can necessarily apply
>>>> in all situations that applications might wish to represent.
>>>>
>>>> At heart: when it comes to exchanging provenance information, why do we
>>>> *need* to know exactly what makes one entity a constrained view of
>>>> another? What breaks (at the level of exchanging provenance information) if
>>>> we have no access to such information? How are applications that exchange
>>>> provenance information about entities for which they don't actually know
>>>> about these attributes to know about their correspondences with real-world
>>>> things?
>>>>
>>>> I think the role of attributes here is mainly to *explain* some aspects of the
>>>> provenance model, but they do not need to be part of the model.
>>>>
>>>> To my mind, a simpler approach would be to allow for assertion of an IVPof
>>>> type of relationship between entities, from which some useful inferences
>>>> about any attributes present might flow, but I don't see the need for the
>>>> attributes to be in any sense defining of the entities.
>>>>
>>>> <aside>
>>>> My suggested definition of IVPof might be something like this:
>>>>
>>>> A IVPof B iff forall p : (Entity -> Bool) . p(B) => p(A)
>>>>
>>>> where A, B are Entities, and the values of p are predicates on Entities.
>>>> </aside>
>>>>
>>>> ...
>>>>
>>>> #g
>>>>
>>>
>>>
>>
>
Received on Friday, 30 September 2011 09:11:53 UTC