Provenance model, syntax and formal properties

I've been thinking about my review yesterday of the "wrong" PROV-DM WD4 
document, and my subsequent scan of the newer version of Part 1, which leads me 
to question whether we've got the right breakdown.

TL;DR: consider a breakdown of provenance model into just two parts:
(1) provenance model, covering concepts model and functional notation syntax
(2) formal properties, covering assumed constraints and inferences and 
provenance of changeable things

...

I realize that in yesterday's telecon, we resolved to split the document into 
three parts, but in the light of a better understanding of what we're defining 
and what we're proposing, I want to reopen the question of whether we've chosen 
the right breakdown.  I also have some concerns about terminology.

I probably missed something, but my original expectation following the F2F was 
that the DM document would be separated into *two* parts: one providing a 
concise description of the model and the concepts, and the second dealing with 
its formal properties.  What I think we agreed yesterday was *three* parts: the 
"model", "constraints" and "abstract syntax".


First, the terminology:

I don't think the terms "abstract syntax" and "constraints" are quite right. 
The "abstract syntax" is not really an abstract syntax, but a concrete syntax 
for a functional notation for provenance - so why don't we call it a "functional 
notation" (as OWL does)?

My other problem is the characterization of the formal properties as 
"constraints"; it's true that the inferences that we want provenance to support 
come about because of the constraints the expressions are assumed to obey, but 
to characterize the whole topic as "constraints" is, I think, to miss the main 
point.  I think what we're really describing is a provenance algebra.


And so to the substantive matter of how to make the required simplifications:

Reflecting on my review of the older DM document, none of the problems of 
difficulty or complexity that I encountered were to do with the syntax.

Indeed, I think that, in many ways, the syntax actually helps to make clear what 
the constituents of each provenance record are, as long as it's handled in a way 
that doesn't get bogged down in lexical minutiae; in that respect I think the 
original handling of syntax did quite well.  Further, the functional notation syntax 
gives us a concise notation for talking about provenance records, and in 
particular for presenting examples.  It seems strange to me that the proposed 
"Part 1" document contains examples that use the syntax without actually 
defining it.  I think the syntax productions could be included in the new 
PROV-DM core section without significantly complicating it.

So my suggestion is to consider a breakdown into *two* parts:
(1) provenance model, covering concepts model and functional notation syntax
(2) provenance algebra, or formal properties, covering assumed constraints and 
inferences, and in particular opening up the issues of how to handle provenance 
for dynamic resources.  The formal properties described here are underpinned by 
the formal semantics (model theory) document.
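
To make the proposed split concrete, here is a minimal sketch in Python.  It is 
illustrative only: the record shape, the relation name "dependsOn" and the 
transitivity constraint are all hypothetical and not taken from PROV-DM.  The 
first half stands in for part (1), records written in a functional notation; 
the second half stands in for part (2), an inference that follows from an 
assumed constraint.

```python
from typing import NamedTuple


class Record(NamedTuple):
    """A provenance record: a relation name plus its arguments (hypothetical shape)."""
    name: str
    args: tuple


# Part (1): model plus functional notation.
def to_functional(record):
    """Render a record in a functional notation, e.g. dependsOn(e3, e2)."""
    return f"{record.name}({', '.join(record.args)})"


# Part (2): formal properties.
def infer_transitive(records, relation):
    """Close binary records of `relation` under an ASSUMED transitivity constraint."""
    pairs = {r.args for r in records if r.name == relation and len(r.args) == 2}
    changed = True
    while changed:
        changed = False
        for (a, b) in list(pairs):
            for (c, d) in list(pairs):
                if b == c and (a, d) not in pairs:
                    pairs.add((a, d))
                    changed = True
    return {Record(relation, p) for p in pairs}


# "dependsOn" is a made-up relation used only for this illustration.
records = [
    Record("dependsOn", ("e3", "e2")),
    Record("dependsOn", ("e2", "e1")),
]
inferred = infer_transitive(records, "dependsOn")
for r in sorted(inferred):
    print(to_functional(r))
```

The point of the sketch is only that the notation and the records can be 
described without reference to the inference rules, while the rules are what 
depend on the assumed constraints.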

#g

Received on Friday, 24 February 2012 05:55:46 UTC