Definitions and provenance and invariance

From: Graham Klyne <GK@ninebynine.org>
Date: Tue, 14 Jun 2011 12:09:15 +0100
To: W3C provenance WG <public-prov-wg@w3.org>
I'm getting a distinct feeling that the reductionist focus on trying to define 
terms in isolation is not helping us move towards a useful consensus.  I feel it 
is tending to force our attention to matters that are not important and where we 
might reasonably have different views, rather than on those matters where we 
already are pretty much agreed. I'll try and explain why, then I'll offer an 
alternative approach.

The problem:

The more precisely one tries to define a concept, the more there is in the 
definition to disagree with, or the fewer real-world entities actually conform 
to that definition.  In model theoretic formalisms, one can never completely 
constrain the definition of a term to limit the satisfying interpretations to 
just one possible referent for that term (cf. "The chief utility of a formal 
semantic theory is not to provide any deep analysis of the nature of the things 
being described by the language [...] but rather to provide a technical way to 
determine when inference processes are valid" -- Pat Hayes, RDF Model Theory, 

In the world of ontologies, it is the simple, small ontologies that say very 
little, and leave little room for disagreement, that are widely used (FOAF, 
SiOC, DC, VOID, etc.).  (There are exceptions, such as the Gene Ontology family, 
but the difference here is that such ontologies are being used within a 
community to encode a substantial body of evolving domain knowledge.)

In natural language (which we are using to create our definitions), W V O 
Quine compellingly argues (in at least one of his essays in "Ontological 
Relativity") that it is not possible to constrain meanings for individual terms 
in ways that allow for correct assessment of the truth of any sentence, and that 
the role of words does not necessarily map one-to-one between languages that 
have comparable expressive power (Quine describes the role of number words in 
western languages and Japanese).  But what we can do more easily is agree (or 
not) about the truth of complete sentences.  (As I write this, I don't have my 
copy of Ontological Relativity to hand, so am relying on memory for the references.)


What I propose, and I think it parallels a thought that Jim has already 
expressed (http://lists.w3.org/Archives/Public/public-prov-wg/2011Jun/0015.html, 
and elsewhere), is that we look at a minimal model of related provenance 
concepts, and agree something about the combined meanings of the concepts and 
their relationships.  For the purposes of exposition, I shall focus on 
time-varying properties, but I believe the approach can generalize to any 
variation in a resource's property.

My core structure is:

   Dynamic resource
     v has view
   View resource
     v has provenance
   Provenance resource

Where the possible sets of differently labelled resources are not disjoint.  I 
think the key criterion that we are trying to express is that the relation "has 
provenance" carries a requirement of invariance between the view resource and the 
provenance resource.

Suppose that the Dynamic resource has a number of different observable 
properties, some of which do not change over time, and others which do.  Then 
the View resource would be a resource with a similar set of properties that do 
not change over time, but correspond to the dynamic resource properties at a 
given time (including properties that do not change over time).  If the 
Dynamic resource does not change over time, then it may also serve as its own 
view resource:  the "has view" property can be reflexive.

The provenance resource is an assertion about the properties of the view 
resource.  I believe the key requirement that we try to capture is that the 
properties about which the provenance resource makes assertions are invariant - 
there is no assertion in the provenance resource which is not always true of the 
view resource.
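
A minimal sketch of the structure above, assuming that each property of a 
dynamic resource is modelled as a function of time t.  All names here 
(view_at, holds_for, the example page) are hypothetical and chosen only for 
illustration; this is one possible reading, not a definition.

```python
from typing import Any, Callable, Dict

# Assumption: a dynamic resource maps property names to functions of time t.
DynamicResource = Dict[str, Callable[[float], Any]]
# Assumption: a view resource maps property names to fixed (invariant) values.
ViewResource = Dict[str, Any]
# Assumption: a provenance resource is a set of assertions (name -> value)
# about the properties of a view resource.
ProvenanceResource = Dict[str, Any]


def view_at(resource: DynamicResource, t: float) -> ViewResource:
    """Freeze every property of the dynamic resource at time t ("has view")."""
    return {name: prop(t) for name, prop in resource.items()}


def holds_for(view: ViewResource, provenance: ProvenanceResource) -> bool:
    """True if every assertion in the provenance resource is true of the view."""
    return all(
        name in view and view[name] == value
        for name, value in provenance.items()
    )


# Hypothetical example: one property is constant, the other changes at t = 10.
page: DynamicResource = {
    "uri": lambda t: "http://example.org/page",
    "content": lambda t: "draft" if t < 10 else "final",
}

early_view = view_at(page, 5)
late_view = view_at(page, 20)
provenance: ProvenanceResource = {"content": "draft"}
```

Because a view's properties are fixed values, an assertion that holds for 
early_view holds for it always; the same assertion need not hold for 
late_view, which is a different view of the same dynamic resource.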


This could (and should) be cast in more mathematical terms (e.g. resource 
properties as functions of time t), but I think it would be quite easy to 
formally express the required constraints and I'll skip doing so in this email.

In writing this, I think it reflects quite closely what Luc has been describing 
through IVPTs, or whatever, but in considering the different resources and 
relationships between them I find it much easier to focus on and express what (I 
think) is important.

