RE: Noah Mendelsohn Comments on July 26 Draft of TAG Versioning Finding from David Orchard on 2006-09-05 (www-tag@w3.org from September 2006)

From: David Orchard <dorchard@bea.com>
Date: Mon, 4 Sep 2006 17:17:22 -0700
To: <noah_mendelsohn@us.ibm.com>
Cc: <www-tag@w3.org>
Message-ID: <E16EB59B8AEDF445B644617E3C1B3C9C02376156@repbex01.amer.bea.com>
Noah,

Part 1 of ? Parts.

I've gone through your comments.  Thanks again for the extensive
review.  Your comments did not substantially conflict with much of the
work that I had done, which is good.

I'm going to respond by quoting sections that you wrote followed by my
comments.  I hope that's the best way to work through the comments.

NM>>>
The finding claims that constraints are part of the language. I'm not
convinced that's a good formulation, since the constraints are embodied
in the set of texts & mappings. Stated differently, I think we're
confusing a "language" with "the specification of a language", and those
are very different. So, I think a language should be a set of texts and
their interpretation as information, and I am very happy with the way
you present that much. 

I think we should have separate sections that talk about managing the
specifications for languages as they evolve, and certainly constraint
languages like XML Schema are among the good tools for writing
specifications. It's OK to talk about keeping a language and its
specification in sync, and to talk about constraint language features
that facilitate versioning. I don't think the constraints are the
language. I think they are emergent properties of the language that can
sometimes be usefully set down in mathematical and/or machine readable
notation such as regex's or XML Schemas. This is an important
distinction on which I disagree with the finding as drafted 
<<

I'm not persuaded that a language doesn't include constraints on the
language.  I think the key part is that the set of texts may be
determined by the constraints.  Using one of your favourite examples,
suppose I create a language that has Red, Green, Blue.  There we have
listed the texts.

But one of my favourite examples is the Name language, which has given
and family, and those are simply strings.  Whether Aaaaa and Aaaa and
Aaa and Aa are part of the language didn't even occur to me until I
wrote this.  When a processor determines whether a text is in the
language, it doesn't generate all the texts and then compare; it
evaluates the text against the constraints without having all the
texts in hand.  I think any constraints are fundamentally part of the
language.
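
A minimal sketch of the two membership styles discussed above, in
Python.  The names (COLOR_TEXTS, in_color_language, in_name_language)
and the letters-only rule for name parts are assumptions made for
illustration; the message itself only says that given and family are
"simply strings".

```python
import re

# Enumerated language: the texts are listed, so membership is a lookup.
COLOR_TEXTS = {"Red", "Green", "Blue"}


def in_color_language(text: str) -> bool:
    return text in COLOR_TEXTS


# Constraint-defined language: the set of texts is unbounded, so membership
# is decided by evaluating constraints, never by listing every text.
# Assumed constraint (not from the message): each part is one or more letters.
NAME_PART = re.compile(r"[A-Za-z]+")


def in_name_language(given: str, family: str) -> bool:
    return bool(NAME_PART.fullmatch(given)) and bool(NAME_PART.fullmatch(family))
```

With these definitions, in_name_language("Aaaaa", "Aa") holds even
though nobody ever wrote that text down, which is the point of the
Name example.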

It seems to me that in some languages, membership is determined by
having the set of texts, and in others the set of texts can be
generated from the constraints.  So, can we come up with a modelling
mechanism that allows a language to refer to one thing, rather than
the 2 that it currently does (texts and constraints)?  Perhaps this was
what the "membership" bucket was an attempt to model.

Now I could flip this around and suggest we go the opposite way and
remove Text Set and Information Set: languages have semantics and
syntax, and texts are in a language if they meet the syntax.
Languages also have a mapping between any individual text and an
individual information "item".

I thought about breaking the relationship between language and syntax,
leaving syntax just connected to text set.  If I squint hard enough, I
can see that could work.  But I think that doesn't pass the common view
of language, which is that language is directly related to syntax rather
than indirectly via text set; see my Name language example.

NM>>
I think we can and should do better in telling a story about whether a
particular text is compatible as interpreted in L1 or L2, vs. the senses
in which languages L1 and L2 as a whole are compatible. I think the
story I would tell would be along the lines of: 

Of a particular text written per language L1 and interpreted per
language L2: "Let I1 be the information conveyed by Text T per language
L1. Text T is "fully compatible" with language L2 if and only if when
interpreted per language L2 to yield I2, I1 is the same as I2. Text T is
"incompatible" if any of the information in I2 is wrong (i.e., was not
present in I1 or replaces a value in I1 with a different one...this
rule disallows additional information, because only the information in
I1 is what the sender thought they were conveying, so anything else is
at best correct accidentally). There are also intermediate notions of
compatibility: e.g. it may be that all of the information in I2 is
correct, but that I2 is a subset of I1. [Not sure whether we should name
some of these intermediate flavors, but if we do, they should be defined
precisely.]

Of languages L1 and L2: We say that language L2 is "fully backward
compatible" with L1 if every text in L1 is fully compatible with L2. We
say that language L1 is "backwards incompatible" with L2 if any text in
L1 is incompatible with L2. We say that Language L1 is "fully forwards
compatible" with L2 if every text in L2 is fully compatible with L1. We
say that L2 is "forwards incompatible with L1" if any text in L2 is
incompatible with L1. As with texts, there may be intermediate notions
of language compatibility for which we do not [or maybe we should?]
provide names here.

That all seems pretty simple and clean to me, and I think it's a firm
foundation for much of the rest of the analysis. Notice that it seems
natural to leave out discussion of the constraints in this layer; the
story gets simpler without them. The current draft seems to me a bit
loose in both talking about and defining issues for languages as a whole
vs. for individual texts. 
<<
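
The definitions quoted above can be sketched in Python as follows.
This is one possible reading, not the finding's own formalization.
Assumptions: information is modelled as a dict of named values, a
language as a finite dict from text to information, the unnamed
intermediate case is labelled "partial", and a text that the reading
language cannot interpret at all counts as not fully compatible.  The
example languages V1 and V2 are hypothetical.

```python
from typing import Dict

Info = Dict[str, str]        # information conveyed by one text (assumed model)
Language = Dict[str, Info]   # text -> information (assumed finite model)


def classify_text(i1: Info, i2: Info) -> str:
    """Compare I1 (what the sender meant) with I2 (what the receiver got)."""
    if i1 == i2:
        return "fully compatible"
    # Any item in I2 that is absent from I1, or has a different value, is wrong.
    if any(key not in i1 or i1[key] != value for key, value in i2.items()):
        return "incompatible"
    # Everything in I2 is correct, but I2 is a strict subset of I1.
    return "partial"


def every_text_fully_compatible(written: Language, read: Language) -> bool:
    """True if every text of `written` means exactly the same under `read`."""
    return all(
        text in read and classify_text(info, read[text]) == "fully compatible"
        for text, info in written.items()
    )


def fully_backward_compatible(l2: Language, l1: Language) -> bool:
    # L2 is fully backward compatible with L1:
    # every text in L1 is fully compatible with L2.
    return every_text_fully_compatible(written=l1, read=l2)


def fully_forward_compatible(l1: Language, l2: Language) -> bool:
    # L1 is fully forwards compatible with L2:
    # every text in L2 is fully compatible with L1.
    return every_text_fully_compatible(written=l2, read=l1)


# Hypothetical languages: V2 adds one text to V1.
V1: Language = {"red": {"color": "red"}}
V2: Language = {"red": {"color": "red"}, "teal": {"color": "teal"}}
```

Here V2 is fully backward compatible with V1, while V1 is not fully
forwards compatible with V2, because V1 assigns no information to the
text "teal".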

I have been moving towards this space as well in my examination of
partial understanding.  It is yet another example of the "class" vs.
"instance" distinction that always seems to come up in modeling and
system design.

But I still disagree with the removal of syntax.  I could easily suggest
alternative wording that makes use of syntax and makes sense to
me.  "Of languages L1 and L2: We say that language L2 is "fully backward
compatible" with L1 if every text valid under L1's constraints is fully
compatible with L2".  I could even push it further and define the
Syntactic constraints, S1 and S2, then rephrase as "We say that language
L2 is "fully backward compatible" with L1 if every text valid under S1
..."

NM>>
Clarify focus on texts vs. documents
...
<<

I agree.  I've inserted part of one of your paragraphs.

My comments on your comments end here.



> -----Original Message-----
> From: noah_mendelsohn@us.ibm.com [mailto:noah_mendelsohn@us.ibm.com] 
> Sent: Monday, August 28, 2006 4:22 PM
> To: David Orchard
> Cc: www-tag@w3.org
> Subject: Noah Mendelsohn Comments on July 26 Draft of TAG 
> Versioning Finding
> 
> First of all, thanks again to Dave for the truly heroic work 
> on the versioning finding.  This problem is as tough as they 
> get IMO, and I think the drafts are making really steady 
> progress.  Still, as I've mentioned on a number of 
> teleconferences, I have a number of concerns regarding the 
> conceptual layering in the draft versioning finding, and some 
> suggestions that I think will make it cleaner and more 
> effective.  Dan Connolly made the very good point that it is 
> really only appropriate to raise such concerns in the context 
> of a detailed review of what has already been 
> drafted.   So, I've tried to do that. 
> 
> A copy of my annotated version of the July 26 draft is 
> attached.  I've taken quite a bit of trouble over these 
> comments, which are quite extensive, and while I'm sure that 
> they will prove to be only partly on the right track, I hope 
> they will get a detailed review not just from Dave 
> but also from other concerned TAG members.   Anyway, what 
> I've done is to 
> take Dave's July 26th draft and add comments marked up using 
> CSS highlighting.  These are in two main groups:
> 
> 1) An introductory section sets out some of the main 
> architectural issues and ideas that I've been trying to 
> convey.  I don't expect these will seem entirely justified 
> until you read the rest of the comments (if then), but I 
> think it's important to collect the significant ideas, and to 
> separate them from the smaller editorial suggestions.
> 
> 2) I've gone through about the first third of Dave's draft, 
> inserting detailed comments.  Some of these are purely 
> editorial, but most of them are aimed at motivating and 
> highlighting the concerns that led me to propose the major 
> points in that introductory chapter.  Indeed, I've tried to 
> hyperlink back from the running comments to the larger 
> points, as I think that helps to motivate them.
> 
> No editor working on a large draft entirely welcomes 
> voluminous comments, especially ones that have structural 
> implications.  Dave:  I truly hope this is ultimately useful, 
> and I look forward to working with you on it. 
> Where possible, I've tried to suggest text fragments you can 
> steal if you like them.  I actually am fairly excited, 
> because working through Dave's draft has helped me to 
> crystalize a number of things about versioning in my own 
> mind.  I think we're well along to telling a story that's 
> very clean, very nicely layered, and perhaps a bit simpler 
> and shorter than the current draft suggests.  I don't think 
> it involves throwing out vast swaths of what Dave has 
> drafted, so much as cleaning up and very carefully relayering 
> some concepts. 
> 
> BTW: I will be around on and off until about Wed. afternoon, 
> then gone 
> until after US Labor Day weekend.   Thanks again, Dave.  
> Really nice work!
> 
> Noah
> 
> [1] http://www.w3.org/2001/tag/doc/versioning-20060726.html
> 
> 
> 
> --------------------------------------
> Noah Mendelsohn
> IBM Corporation
> One Rogers Street
> Cambridge, MA 02142
> 1-617-693-4036
> --------------------------------------
> 
> 
> 
>
Received on Tuesday, 5 September 2006 00:18:25 UTC