Some thoughts about the revised Provenance Model document

I apologize that this has taken so long for me to assemble since last week's 
teleconference - I've been juggling a lot of tasks, and this one got pushed 
aside.  I don't think anything I say here fundamentally changes the direction of 
travel, but there are some comments I have concerning the revised Provenance 
Model document (which, to repeat myself, I think is a big step forwards).

My comments fall into three areas:

1. Terminology: characterized things, entities and entity expressions
2. complementOf / IVPof relations between things
3. Do we need to model "Characterizing attributes"?

...

1. Terminology: Characterized things, Entities and Entity expressions

The terminology and explanation of these concepts are now much clearer to me.

Referring to the document at 
http://dvcs.w3.org/hg/prov/raw-file/tip/model/ProvenanceModel.html (as retrieved 
on 2011-09-29) I note that:
- Fragment #a-conceptualization-of-the-world describes the notions of "things in 
the world" and "characterized thing"
- Fragment #prov-dm-overview has a diagram with a box labelled "Entity", which 
to my mind corresponds to the "characterized thing" mentioned previously.
- Fragment #expression-Entity introduces an "entity expression" which is the ASN 
construct that describes an "identifiable characterized thing"

These are all very helpful clarifications over what we had previously, and I'm 
particularly pleased that there is clarity here about the distinction between an 
ASN language construct and the thing it describes.

My comment is this:  now that "entity expression" is used to refer to the 
language construct, the term "entity" can unambiguously refer to the thing 
described by that construct; i.e. the characterized thing.  Indeed, this is 
already suggested by the diagram at #prov-dm-overview.

Then, I would say that in the entity expression:

   entity(e0, [ type="File", location="/shared/crime.txt", creator="Alice" ])

the identifier e0 *denotes* the described *entity* (which we read is a file at 
location "/shared/crime.txt" created by Alice).

And, from the expression and assertions in the surrounding story:

   entity(e2, [ type="File", location="/shared/crime.txt", creator="Alice",
                content="There was a lot of crime in London last month."])

we can say that identifier e2 denotes another entity that is a particular view 
of the entity e0, which happens to contain the text "There was a lot of crime in 
London last month."
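
To make the reading above concrete, here is a rough sketch in Python (not ASN, and not anything defined by the Provenance Model document).  The names EntityExpression, entity, repeats_all_attributes and extra_attributes are my own inventions for illustration.  The sketch only shows that the expression identified by e2 repeats everything asserted for e0 and adds one more attribute; it is not meant to suggest that attribute comparison is what *defines* the view relationship.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EntityExpression:
    """Hypothetical stand-in for an entity expression: the language
    construct, as distinct from the entity its identifier denotes."""
    identifier: str
    attributes: tuple  # sorted (name, value) pairs


def entity(identifier, **attributes):
    """Build an expression shaped like entity(id, [ name=value, ... ])."""
    return EntityExpression(identifier, tuple(sorted(attributes.items())))


e0 = entity("e0", type="File", location="/shared/crime.txt", creator="Alice")
e2 = entity("e2", type="File", location="/shared/crime.txt", creator="Alice",
            content="There was a lot of crime in London last month.")


def repeats_all_attributes(view, general):
    """True if every attribute asserted for `general` is also asserted for `view`."""
    return set(general.attributes) <= set(view.attributes)


def extra_attributes(view, general):
    """Attributes asserted for `view` but not for `general`."""
    return dict(set(view.attributes) - set(general.attributes))
```

Under these assumptions, repeats_all_attributes(e2, e0) holds, and extra_attributes(e2, e0) contains just the "content" attribute.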

I've somewhat laboured this, but what I'm suggesting is that the concept 
"entity" can be used to mean what has been introduced as a "characterized thing".

...

2. complementOf / IVPof relations between things

This topic has been discussed on the mailing list over the past week, and I just 
wish to add my voice to those who see a useful role for something like "IVPof" 
(even though I don't especially like that term), and separately that the term 
"complementOf" is somewhat unhelpful, if not actually confusing to read.

Concerning IVPof: I see this as being useful as a primitive relation that can be 
*asserted* between entities; e.g. to assert that e2 is an IVPof e0.

Which leads to...

...

3. Do we need to model "Characterizing attributes"?

The notion of "characterizing attributes" has been developed in order to derive 
the relationship between different entities that are views of some common thing 
in the world.  I am not convinced that we need to model these attributes, and 
I'm not sure the way they are modelled can necessarily apply in all situations 
that applications might wish to represent.

At heart:  when it comes to exchanging provenance information, why do we *need* 
to know exactly what makes one entity a constrained view of another?  What 
breaks (at the level of exchanging provenance information) if we have no access 
to such information?  How are applications that exchange provenance information 
about entities whose characterizing attributes they do not actually know to 
know how those entities correspond to real-world things?

I think the role of attributes here is mainly to *explain* some aspects of the 
provenance model, but they do not need to be part of the model.

To my mind, a simpler approach would be to allow for assertion of an IVPof type 
of relationship between entities, from which some useful inferences about any 
attributes present might flow, but I don't see the need for the attributes to be 
in any sense defining of the entities.
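
As a rough illustration of what I mean (a Python sketch under my own assumptions, not a proposal for the model), the IVPof relationship below is simply asserted as a pair of identifiers, and attributes are optional annotations.  The names ivp_of, attributes and inferred_attributes are hypothetical.  The one inference shown assumes that whatever is asserted about the more general entity also holds for its views.

```python
# Asserted relation: (view, general) means "view is an IVPof general".
ivp_of = {("e2", "e0")}

# Attributes are optional annotations here; nothing is derived *from* them.
attributes = {
    "e0": {"type": "File", "location": "/shared/crime.txt", "creator": "Alice"},
    "e2": {"content": "There was a lot of crime in London last month."},
}


def inferred_attributes(entity_id, _seen=None):
    """Attributes asserted for an entity, plus those carried over from
    every entity it is asserted to be an IVPof (followed transitively)."""
    seen = set() if _seen is None else _seen
    if entity_id in seen:  # guard against cyclic assertions
        return {}
    seen.add(entity_id)
    result = {}
    for view, general in sorted(ivp_of):
        if view == entity_id:
            result.update(inferred_attributes(general, seen))
    # Attributes asserted directly take precedence over carried-over ones.
    result.update(attributes.get(entity_id, {}))
    return result
```

Here inferred_attributes("e2") picks up the creator and location asserted for e0 without e2 having to repeat them, while e0 gains nothing from e2.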

<aside>
My suggested definition of IVPof might be something like this:

   A IVPof B  iff  forall p : (Entity -> Bool) . p(B) => p(A)

where A, B are Entities, and the values of p are predicates on Entities.
</aside>
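
The definition quantifies over *all* predicates on entities, so it cannot be checked exhaustively.  As a sketch only, the Python below checks it against a finite, hand-picked family of predicates.  Representing entities as dictionaries, and the particular predicates chosen, are assumptions made purely for illustration.  A failure with respect to such a family refutes IVPof; a success shows only that none of the chosen predicates distinguishes the two entities in the wrong direction.

```python
def is_ivp_of(a, b, predicates):
    """Check p(b) => p(a) for every p in a finite family of predicates."""
    return all(p(a) for p in predicates if p(b))


# Hypothetical entity descriptions, taken from the example expressions.
e0 = {"type": "File", "location": "/shared/crime.txt", "creator": "Alice"}
e2 = dict(e0, content="There was a lot of crime in London last month.")

# A hand-picked family of predicates on entities.
predicates = [
    lambda e: e.get("type") == "File",
    lambda e: e.get("location") == "/shared/crime.txt",
    lambda e: e.get("creator") == "Alice",
    lambda e: "crime" in e.get("content", ""),
]
```

With these predicates, is_ivp_of(e2, e0, predicates) holds and is_ivp_of(e0, e2, predicates) does not, because the last predicate is true of e2 but not of e0.  Note also that the definition makes IVPof reflexive: every entity is an IVPof itself.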

...

#g

Received on Thursday, 29 September 2011 10:50:17 UTC