Re: Noah Mendelsohn Comments on July 26 Draft of TAG Versioning Finding

I just went through David's finding - wow, that was a *lot* of work, and
I think David did an admirable job with a host of difficult issues. I
also think that to a large extent David and Noah are *both* right (as I
explain in the next paragraph), although David is thinking more in line
with philosophers and Noah more in line with compsci.  However, I'd like
to make one suggestion: While I think the intuitions of the TAG are
right and very informative, producing a *formalization of these
intuitions* and producing a usable and readable document that your
average corporation or even RDF/XML hacker can use might be two separate
goals, and perhaps the "Versioning Primer" should be given as a separate
document from "Versioning Semantics". I think that, as a good
introduction, most of David's document should stay as is, and the
troublesome bits should be hashed out further in a separate, perhaps
formal, document (the "Versioning Semantics" or something).

In theoretical computer science, one usually starts with the notion of
an alphabet, which is a finite set of symbols. Then you have "strings",
which are finite sequences of symbols from the alphabet, and a language
is any set of strings over the alphabet - and this means a language can
contain an *infinite number* of strings.  This is important, since you
then need to define the language using a grammar (a la constraints),
such as production rules, regular expressions, etc.
So - Noah and Dan are sort of both right. Languages are usually thought
of as "sets of strings", but since those sets are infinite, we need a
finite representation in terms of some constraints. In the same way, an
XML Schema can validate (sort of define) an infinite number of actual
XML documents.
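
To make the finite-representation point concrete, here is a minimal
sketch in Python. It is only an illustration: the alphabet, the regular
expression, and the function names are assumptions picked for the
example, not anything taken from the finding. The language is the
infinite set "A", "Aa", "Aaa", ..., yet its definition is one short
pattern, and membership is decided without ever enumerating the set.

```python
import re
from itertools import count, islice

# Hypothetical alphabet and grammar, chosen only for illustration.
ALPHABET = {"A", "a"}
GRAMMAR = re.compile(r"Aa*")  # finite description of an infinite set


def in_language(text: str) -> bool:
    """Decide membership from the constraint, not from a list of strings."""
    over_alphabet = set(text) <= ALPHABET
    return over_alphabet and GRAMMAR.fullmatch(text) is not None


def enumerate_language():
    """Lazily yield the strings of the language; this generator never ends."""
    for n in count(0):
        yield "A" + "a" * n


# We can only ever look at a finite prefix of the enumeration.
first_four = list(islice(enumerate_language(), 4))
```

Here in_language("Aaaa") holds and in_language("aA") does not, and
neither answer requires having all the texts "in hand".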

I sort of like the text->information division, even though it's a bit
untraditional. I think you're trying to get at something a bit
different from the language->interpretation/denotation division
traditionally made in semantics, so this new terminology has to be
*carefully* connected to semantics or normalized with the terminology
used in logic/math/everywhere else, as Pat pointed out. I'll give it a
go at some point, but then again - this sort of connection might be
possible to make in a separate document.  Using the terms "information"
and "constraints" is not unheard of, though: it sounds very like
"situation semantics" in flavor [1], and it might take some real work to
map this out formally. There are still a few live situation semanticists
floating about, but none of them are very Webby. Robin Cooper comes to
mind [2].


David Orchard wrote:
> Noah,
> Part 1 of ? Parts.
> I've gone through your comments.  Thanks again for doing some extensive
> reviewing.  The comments that you made did not substantially conflict
> with much of the work that I had done, which is goodness.  
> I'm going to respond by quoting sections that you wrote followed by my
> comments.  I hope that's the best way to work through the comments.
> NM>>>
> The finding claims that constraints are part of the language. I'm not
> convinced that's a good formulation, since the constraints are embodied
> in the set of texts & mappings. Stated differently, I think we're
> confusing a "language" with "the specification of a language", and those
> are very different. So, I think a language should be a set of texts and
> their interpretation as information, and I am very happy with the way
> you present that much. 
> I think we should have separate sections that talk about managing the
> specifications for languages as they evolve, and certainly constraint
> languages like XML Schema are among the good tools for writing
> specifications. It's OK to talk about keeping a language and its
> specification in sync. and to talk about constraint language features
> that facilitate versioning. I don't think the constraints are the
> language. I think they are emergent properties of the language that can
> sometimes be usefully set down in mathematical and/or machine readable
> notation such as regex's or XML Schemas. This is an important
> distinction on which I disagree with the finding as drafted 
> <<
> I'm not persuaded that a language doesn't include constraints on the
> language.  I think the key part is that the set of texts may be
> determined by the constraints.  Using one of your favourite examples, if
> I create a language that has Red,Green,Blue.  There we have listed the
> texts.  
> But one of my favourite examples is the Name language, which has given
> and family, and those are simply strings.  Whether Aaaaa and Aaaa and
> Aaa and Aa are part of the language didn't even occur to me until I
> wrote this.  When a processor determines whether a text is in the
> language, it doesn't generate all the texts "in hand" and then compare,
> it will look at the constraints and evaluate without having all the
> texts in hand.  I think any constraints are fundamentally part of the
> language.  
> It seems to me that for some languages, membership is determined by having
> the set of texts, and in others the set of texts can be generated from
> the constraints.  So, can we come up with a modelling mechanism that
> allows a language to refer to one thing, rather than the 2 that it
> currently does (texts and constraints).   Perhaps this was what the
> "membership" bucket was an attempt to model.
> Now I could flip this around and suggest we should go the opposite way
> and suggest removing Text Set and Information Set: languages have
> semantics, syntax and texts are in a language if they meet the syntax.
> Languages also have a mapping between any individual text and an
> individual information "item".  
> I thought about breaking the relationship between language and syntax,
> leaving syntax just connected to text set.  If I squint hard enough, I
> can see that could work.  But I think that doesn't pass the common view
> of language, which is that language is directly related to syntax rather
> than indirectly via text set, see my name syntax example.  
> NM>>
> I think we can and should do better in telling a story about whether a
> particular text is compatible as interpreted in L1 or L2, vs. the senses
> in which languages L1 and L2 as a whole are compatible. I think the
> story I would tell would be along the lines of: 
> Of a particular text written per language L1 and interpreted per
> language L2: "Let I1 be the information conveyed by Text T per language
> L1. Text T is "fully compatible" with language L2 if and only if when
> interpreted per language L2 to yield I2, I1 is the same as I2. Text T is
> "incompatible" if any of the information in I2 is wrong (i.e. was not
> present in I1 or replaces a value in I1 with a different one...this
> rule disallows additional information, because only the information in
> I1 is what the sender thought they were conveying, so anything else is
> at best correct accidentally). There are also intermediate notions of
> compatibility: e.g. it may be that all of the information in I2 is
> correct, but that I2 is a subset of I1. [Not sure whether we should name
> some of these intermediate flavors, but if we do, they should be defined
> precisely.]
> Of languages L1 and L2: We say that language L2 is "fully backward
> compatible" with L1 if every text in L1 is fully compatible with L2. We
> say that language L1 is "backwards incompatible" with L2 if any text in
> L1 is incompatible with L2. We say that Language L1 is "fully forwards
> compatible" with L2 if every text in L2 is fully compatible with L1. We
> say that L2 is "forwards incompatible with L1" if any text in L2 is
> incompatible with L1. As with texts, there may be intermediate notions
> of language compatibility for which we do not [or maybe we should?]
> provide names here.
> That all seems pretty simple and clean to me, and I think it's a firm
> foundation for much of the rest of the analysis. Notice that it seems
> natural to leave out discussion of the constraints in this layer; the
> story gets simpler without them. The current draft seems to me a bit
> loose in both talking about and defining issues for languages as a whole
> vs. for individual texts. 
> <<
> I have been moving towards this space as well in my examination of
> partial understanding.  It is yet another example of this "class" vs
> "instance" that seems to always come up in modeling and system design.  
> But I still disagree with the removal of syntax.  I could easily suggest
> somewhat alternate wording that makes use of syntax and makes sense to
> me.  "Of languages L1 and L2: We say that language L2 is "fully backward
> compatible" with L1 if every text valid under L1's constraints is fully
> compatible with L2".  I could even push it further and define the
> Syntactic constraints, S1 and S2, then rephrase as "We say that language
> L2 is "fully backward compatible" with L1 if every text valid under S1
> ..."
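
The text-level and language-level definitions quoted above can be
sketched in a few lines of Python. This is a toy model under stated
assumptions, not the finding's own formalization: a language is a
finite dict from texts to the information they convey, information is a
dict of name/value pairs, and all names (classify, V1, V2, V1X) are
hypothetical. The "rejected" outcome for a text the second language does
not accept is also an assumption, since the quoted definitions do not
cover that case.

```python
from typing import Dict, Optional

Info = Dict[str, str]        # information conveyed by one text
Language = Dict[str, Info]   # toy model: text -> information


def classify(text: str, l1: Language, l2: Language) -> str:
    """Classify a text written per l1 when it is interpreted per l2."""
    i1 = l1[text]
    i2: Optional[Info] = l2.get(text)
    if i2 is None:
        return "rejected"            # assumption: l2 does not accept the text
    if i2 == i1:
        return "fully compatible"    # I1 is the same as I2
    if all(k in i1 and i1[k] == v for k, v in i2.items()):
        return "partial"             # I2 is correct but a subset of I1
    return "incompatible"            # I2 contains something not in I1


def fully_backward_compatible(l1: Language, l2: Language) -> bool:
    """True if every text in l1 is fully compatible with l2."""
    return all(classify(t, l1, l2) == "fully compatible" for t in l1)


def fully_forward_compatible(l1: Language, l2: Language) -> bool:
    """True if every text in l2 is fully compatible with l1."""
    return all(classify(t, l2, l1) == "fully compatible" for t in l2)


# Hypothetical versions of a tiny name language.
V1: Language = {"given=Ann": {"given": "Ann"}}
V2: Language = {
    "given=Ann": {"given": "Ann"},
    "given=Ann;middle=B": {"given": "Ann", "middle": "B"},
}
# A V1 variant that accepts the longer text but ignores the middle name.
V1X: Language = {
    "given=Ann": {"given": "Ann"},
    "given=Ann;middle=B": {"given": "Ann"},
}
```

In this model the keys of each dict play the role of the text set. The
syntax-based wording would instead iterate over the texts that satisfy
the constraints S1, which matters as soon as the language is infinite
and cannot be listed.
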
> NM>>
> Clarify focus on texts vs. documents
> ...
> <<
> I agree.  I've inserted part of one of your paragraphs.
> My comments on your comments end,
>> -----Original Message-----
>> From: [] 
>> Sent: Monday, August 28, 2006 4:22 PM
>> To: David Orchard
>> Cc:
>> Subject: Noah Mendelsohn Comments on July 26 Draft of TAG 
>> Versioning Finding
>> First of all, thanks again to Dave for the truly heroic work 
>> on the versioning finding.  This problem is as tough as they 
>> get IMO, and I think the drafts are making really steady 
>> progress.  Still, as I've mentioned on a number of 
>> teleconferences, I have a number of concerns regarding the 
>> conceptual layering in the draft versioning finding, and some 
>> suggestions that I think will make it cleaner and more 
>> effective.  Dan Connolly made the very good point that it is 
>> really only appropriate to raise such concerns in the context 
>> of a detailed review of what has already been 
>> drafted.   So, I've tried to do that. 
>> A copy of my annotated version of the July 26 draft is 
>> attached.  I've taken quite a bit of trouble over these 
>> comments, which are quite extensive, and while I'm sure that 
>> they will prove to be only partly on the right track, I hope 
>> they will get a detailed review not just from Dave 
>> but also from other concerned TAG members.   Anyway, what 
>> I've done is to 
>> take Dave's July 26th draft and add comments marked up using 
>> CSS highlighting.  These are in two main groups:
>> 1) An introductory section sets out some of the main 
>> architectural issues and ideas that I've been trying to 
>> convey.  I don't expect these will seem entirely justified 
>> until you read the rest of the comments (if then), but I 
>> think it's important to collect the significant ideas, and to 
>> separate them from the smaller editorial suggestions.
>> 2) I've gone through about the first third of Dave's draft, 
>> inserting detailed comments.  Some of these are purely 
>> editorial, but most of them are aimed at motivating and 
>> highlighting the concerns that led me to propose the major 
>> points in that introductory chapter.  Indeed, I've tried to 
>> hyperlink back from the running comments to the larger 
>> points, as I think that helps to motivate them.
>> No editor working on a large draft entirely welcomes 
>> voluminous comments, especially ones that have structural 
>> implications.  Dave:  I truly hope this is ultimately useful, 
>> and I look forward to working with you on it. 
>> Where possible, I've tried to suggest text fragments you can 
>> steal if you like them.  I actually am fairly excited, 
>> because working through Dave's draft has helped me to 
>> crystalize a number of things about versioning in my own 
>> mind.  I think we're well along to telling a story that's 
>> very clean, very nicely layered, and perhaps a bit simpler 
>> and shorter than the current draft suggests.  I don't think 
>> it involves throwing out vast swaths of what Dave has 
>> drafted, so much as cleaning up and very carefully relayering 
>> some concepts. 
>> BTW: I will be around on and off until about Wed. afternoon, 
>> then gone 
>> until after US Labor Day weekend.   Thanks again, Dave.  
>> Really nice work!
>> Noah
>> [1]
>> --------------------------------------
>> Noah Mendelsohn
>> IBM Corporation
>> One Rogers Street
>> Cambridge, MA 02142
>> 1-617-693-4036
>> --------------------------------------


Harry Halpin,  University of Edinburgh 6B522426

Received on Friday, 8 September 2006 15:31:18 UTC