RE: Updated versioning strategies doc [XMLVersioning-41 ISSUE-41] from noah_mendelsohn@us.ibm.com on 2007-11-02 (www-tag@w3.org from November 2007)

From: <noah_mendelsohn@us.ibm.com>
Date: Fri, 2 Nov 2007 16:54:51 -0400
To: "David Orchard" <dorchard@bea.com>
Cc: "Dan Connolly" <connolly@w3.org>, "www-tag" <www-tag@w3.org>
Message-ID: <OFA0AF10D8.49031409-ON85257387.006BADC1-85257387.0072ACD3@lotus.com>

I've gotten several pages in on my review, but not all the way.  Given 
that we're meeting on Monday, I thought it would be most useful to pass on 
the comments I have so far.  These comments reflect a review of the text 
up to the start of section "2.1.1 Identifying Languages".  Also, these are 
just the main points I'd make.  I have some detailed editorial suggestions 
and concerns that I'll try to post later.

* Mostly I agree with the main points that the draft seems to be trying to 
make, and I think that's real progress.  Some exceptions are noted below.

* The introduction assumes too much knowledge of concepts and terminology. 
 For example, in the 2nd paragraph we find: "This finding describes 
general problems and techniques in evolving systems in compatible ways. A 
number of design patterns and rules are discussed with a focus towards 
enabling forwards-compatible changes to languages."   We haven't said a 
thing about what forwards compatible versioning is, and indeed the draft 
doesn't address that until several pages later.   Readers will think 
"should I understand that?".  Better might be:  "This finding describes 
general problems and techniques in evolving systems in compatible ways. A 
number of design patterns and rules are discussed, with a particular focus 
on enabling older versions of applications to operate on inputs that make 
use of newer language features."    The introduction is an important part 
of the document and it shouldn't be confusing to new readers.   I think it 
needs some significant rework to deal with issues like this, but I mostly 
agree with what I think it's trying to say.

* There are many other unexplained forward references to terminology that 
need to be fixed.  In addition to "forwards-compatible" as mentioned 
above, I see unexplained references to "extensible" (in the list of good 
practices in section 1); "schema" (which is not always used in the usual 
sense); "flavors of a schema"; "resources" (which appears to be used for 
something that resembles a consuming application rather than what Web Arch 
calls a resource  -- for example there is a phrase "if a language is 
changed in such a way that all those resources will consider texts of the 
new language invalid"  -- I don't think Web Arch resources typically 
implement the verb "consider"); "our Name example" (the draft says "Recall 
our Name example and consider... ", but in fact the Name example has been 
neither introduced nor referenced); and "component" (in the name example). 
  All this unexplained terminology seems to undercut the rigor of the 
presentation, without making it particularly easier to read or more 
accessible.  It's just confusing. 

* Section 1: "if the texts of the language contain version identifiers, 
then texts where the version identifier is unknown can be treated as if 
the version identifier was known."  This has a number of problems I think. 
 First of all it presupposes that such identifiers are optional.  One 
approach is to make them mandatory, in which case you don't need to say 
what to do if they're missing, other than to say the document is not 
legal.  Even if they are optional, it's not at all clear that you need to 
treat the document as if it were of some other version that could have 
been specifically identified.  I think a better rule would be something 
like:  "If a language allows for version identifiers in document text, and 
if use of those identifiers is optional, then the language must provide 
for the interpretation of documents in which the version identifier is 
absent."   I think that covers all the cases, and more accurately. 
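
As a rough illustration of the rule proposed above, here is a minimal 
sketch in Python.  Everything specific in it is an assumption made for the 
example and is not taken from the draft: the "version <id>" first-line 
convention, the names read_version and effective_version, and the choice 
of "1" as the interpretation the language assigns when the identifier is 
absent.  It only shows the two cases: identifiers mandatory (a missing one 
makes the document not legal) and identifiers optional (the language 
itself says how to interpret the text).

```python
from typing import Optional, Tuple

# Hypothetical convention: an optional first line "version <id>".
VERSION_PREFIX = "version "

# Assumption: the language specification states that a text with no
# version identifier is interpreted as version "1".
VERSION_WHEN_ABSENT = "1"


def read_version(text: str) -> Tuple[Optional[str], str]:
    """Return (declared version or None, remaining body)."""
    first, _, rest = text.partition("\n")
    if first.startswith(VERSION_PREFIX):
        return first[len(VERSION_PREFIX):].strip(), rest
    return None, text


def effective_version(text: str, identifiers_mandatory: bool = False) -> str:
    """Decide which version's rules apply to a text."""
    declared, _body = read_version(text)
    if declared is not None:
        return declared
    if identifiers_mandatory:
        # Mandatory identifiers: nothing more to say than "not legal".
        raise ValueError("not a legal document: version identifier required")
    # Optional identifiers: the language must define this case.
    return VERSION_WHEN_ABSENT
```

With these assumptions, effective_version("firstname Noah") falls back to 
the interpretation the language defines for unlabeled texts, while the 
same call with identifiers_mandatory=True rejects the document.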

* Section 1.2: "Just Names: some languages don't actually have a syntax or 
grammar; they're just lists of names."  I think we need to very carefully 
introduce the sense in which we're using the words "syntax" and "grammar" 
before saying this.  My understanding is that historically, syntax has 
indeed referred to the structure of words in sentences, etc., and you are 
right that a simple list of words doesn't have syntax in that sense.  On 
the other hand, one can find references in the computer field which seem 
to suggest using syntax to refer to the legal forms of text input for some 
system.  In that sense, languages that just allow names, or lists of names, 
have a syntax, albeit a simple one.  The grammar is just the degenerate "a 
token" or "a space separated list of tokens" (BTW: you should be clearer 
as to whether you're talking about languages in which each document 
contains a single name, but you get to choose it from a list, or whether 
you mean a language in which each instance is a list.   I couldn't tell.) 
For our purposes, I think I'd prefer to take the approach that all 
languages have syntax, though for some languages the structure of legal 
documents is simple to the point of being trivial;  for such languages, 
the grammars are correspondingly simple.
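
To make the "degenerate grammar" point concrete, here is a small sketch in 
Python of the two readings mentioned above.  The definition of a name 
(letters, digits, underscore, hyphen) and the function names are 
assumptions for illustration only; the draft does not define them.  The 
point is just that each reading has a grammar, however trivial, and that 
the two readings accept different sets of texts.

```python
import re

# Assumption: a "name" is a run of letters, digits, underscores or hyphens.
NAME = r"[A-Za-z0-9_-]+"

# Reading 1: each document is exactly one token.
SINGLE_NAME = re.compile(NAME)

# Reading 2: each document is a space separated list of tokens.
NAME_LIST = re.compile(rf"{NAME}(?: {NAME})*")


def is_single_name(text: str) -> bool:
    """True if the whole text is one name."""
    return SINGLE_NAME.fullmatch(text) is not None


def is_name_list(text: str) -> bool:
    """True if the whole text is one or more names separated by single spaces."""
    return NAME_LIST.fullmatch(text) is not None
```

Under these assumptions "firstname lastname" is a legal text in the second 
language but not in the first, which is why it matters which one the draft 
means.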

I hope this is helpful. 

Noah

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------








"David Orchard" <dorchard@bea.com>
10/26/2007 01:54 PM
 
        To:     <noah_mendelsohn@us.ibm.com>, "Dan Connolly" 
<connolly@w3.org>
        cc:     "www-tag" <www-tag@w3.org>
        Subject:        RE: Updated versioning strategies doc 
[XMLVersioning-41  ISSUE-41]


Righto. 

I did an update, now available at 
http://www.w3.org/2001/tag/doc/versioning-compatibility-strategies
http://www.w3.org/2001/tag/doc/versioning-compatibility-strategies-20071026.html



Comments inline.

Cheers,
Dave

> -----Original Message-----
> From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] 
> On Behalf Of noah_mendelsohn@us.ibm.com
> Sent: Tuesday, October 16, 2007 6:49 PM
> To: Dan Connolly
> Cc: www-tag
> Subject: Re: Updated versioning strategies doc 
> [XMLVersioning-41 ISSUE-41]
> 
> 
> Sorry it's taken a while for me to get back to this.   I think 
> this is a 
> step forward, in that it is at least clear and succinct in 
> setting out a few key points.  As to whether I think these 
> are the right points to make, that's a tougher call:
> 
> > This finding describes general problems and techniques in evolving 
> > systems in compatible ways. These techniques are designed to allow 
> > compatible changes.
> 
> OK. 
> 
> The next comment is essentially editorial:
> 
> > A number of design patterns and rules are discussed with a focus 
> > towards enabling versioning.
> 
> I'm not sure that versioning is a well defined noun when used 
> in this way. 
>  In fact, would it make sense to skip the 2nd sentence, leaving:
> 
> "This finding describes general problems and techniques in 
> evolving systems in compatible ways. A number of design 
> patterns and rules are discussed with a focus towards 
> enabling compatible changes (to languages?)."?
> 
> > There are a few crucial good practices that enable forwards 
> compatible 
> > versioning from version 1 of a language.
> 
> Hmm.  Just jumped to forward compatibility, when a minute ago 
> you said "general problems in evolving systems in compatible 
> ways".  That surely goes beyond forward compatibility? 
> 

True.  I reduced the scope to forwards compatibility.

> > There are a few crucial good practices that enable forwards 
> compatible 
> > versioning from version 1 of a language.
> > 
> >   The first is to specify the language is extensible.
> 
> Not quite sure what this means.  First of all, I suspect it 
> should be: 
> "The first is to specify >that< the language is extensible.", 
> but even then, what is that adding, except as an 
> introduction?  It's the syntax and semantics of the language that 
> do or don't make it extensible, not the fact 
> that you say it is.   I don't think you specify that the language is 
> extensible;  you might instead note that the language is extensible. 
> Extensibility is an emergent property of the design.
> 
> > The second is to specify that any text of the language with 
> extensions 
> > can be treated as if the extensions were not present.
> 
> This is the one where I have a significant problem.  I think 
> when you look at many extensible languages they do not 
> necessarily provide that what we're calling extensions are 
> completely ignored by earlier versions.  I've come to believe that 
> what many important languages do is to provide default 
> processing rules. 
> 
> So you might ask: in what sense is extension going on if the 
> syntax already has >some< semantic in V1?  Well, what tends 
> to happen is that in
> V2 the generic processing rule is replaced by something 
> specific. Consider a simple language for transferring named 
> properties, and apply it in particular to names of people. 
> The specification for V1 might say:
> 
> V1 Specification:  Properties are transmitted one per line, 
> as a space separated pair, with the name of the property 
> first, and its value second. 
>  In V1 of the language, the following properties are given special
> significance:  firstname is used for a person's given name, 
> and lastname is used for his or her family name.  All 
> properties are considered part of the name, whether or not 
> they are given particular meaning in this version of the 
> specification.  So, for example, an application storing the 
> name SHOULD record all properties, not just these two, etc. 
> Note that in future versions of this specification, 
> additional properties may be given particular significance.
> 
> V2 Specification:  The V2 language is the same as V1, except 
> that the property middlename is now understood to be the 
> person's middle name.
> 
> I think this is a fine example of a forwards compatible 
> specification. 
> People do this stuff all the time, but the substitution-based 
> approach doesn't say anything about it.  RFC 2616 Section 
> 5.3, for example, specifies that: "Unrecognized header fields 
> are treated as entity-header fields." and "Request-header 
> field names can be extended reliably only in combination with 
> a change in the protocol version."  So, there's a default 
> behavior in HTTP 1.1 (treat as entity-headers), and a warning 
> that for some headers future versions might supply a more 
> distinguished semantic. 
> I'm pretty sure a similar case could be made regarding the 
> default relay semantic for SOAP headers, but I don't have 
> time to remind myself of the pertinent details this evening.
> 
> As I say, I think this sort of thing is common and extremely 
> valuable.  I really would like our finding to tell a story 
> about it, and I worry that "treat as if the extensions 
> weren't present" is really just a special case.  I think we 
> should probably start with the "default semantic", and then 
> say how one example of a default semantic is "treat as if it 
> weren't there at all".

Right.  This reminds me of a conversation I'm having with Ian Hickson
about the difference between an error that is recovered from nicely
and a language design that provides nice recovery from extra things.


I can live with "ignoring" being a special case, and the later parts of the
finding do exactly that.  In fact, the later parts say that "accept and
throw away" and "accept and retain" are both different flavours of must
accept rules. 

We are quite in agreement that part of forwards compatibility is
allowing extra things (bullet #1) and specifying that those extra things
have a processing model that should not be failure (#2). 

I've tried to reword the 2nd item into "any extensions in a text of the
language have a well-defined meaning that at a minimum is that the
extensions are acceptable;"

> 
> > The third is to specify an algorithm for how a text of the language 
> > with a version identifier that is unknown can be treated as if the 
> > version identifier was known.
> 
> Why are we limiting ourselves to languages in which the 
> version used is to be explicitly signaled in band in the 
> text?  Many, many languages (FORTRAN, at least some flavors 
> of C, Java and many other programming languages come to mind) 
> never answer explicitly the question: "which versions of the 
> spec. did I have in mind when I wrote this?", and for those 
> languages step 3 doesn't apply.  In fact, when you see a text 
> that's valid per some version, you can never be 100% sure 
> whether the author intentionally authored to that version, or 
> accidentally wrote something legal (i.e. made a mistake but 
> happened to create correct new version
> syntax) while in fact reading an earlier version of the 
> specification. 
> Version id's tend to be a cross check against such things, 
> IMO, except when a language has evolved in truly incompatible 
> ways.  Then it's really important to say in the document "you 
> better interpret this per V5, or else what you conclude may be wrong!"
> 
> So, I don't think we should limit ourselves to languages with 
> in band version signaling.  If we do make such a limiting 
> assumption, we should say so rather explicitly before making 
> the third statement.  E.g. "This analysis limits itself to 
> the special case of languages that can signal within each 
> text the version(s) of the language to which the document was 
> authored.  For such languages, the third step is to specify 
> an algorithm...."

I never had the intention of limiting ourselves to languages that have
in band versioning information.  However, that does seem to be the vast
majority of cases on the Web and in almost every case with XML.  How
about:
"if the texts of the language contain version identifiers, then texts
where the version identifier is unknown can be treated as if the version
identifier was known."

> 
> Noah
> 
> --------------------------------------
> Noah Mendelsohn
> IBM Corporation
> One Rogers Street
> Cambridge, MA 02142
> 1-617-693-4036
> --------------------------------------
> 
> 
> 
> 
> 
> 
> 
> 
> Dan Connolly <connolly@w3.org>
> Sent by: www-tag-request@w3.org
> 10/04/2007 11:28 AM
> 
>         To:     David Orchard <dorchard@bea.com>
>         cc:     www-tag <www-tag@w3.org>, (bcc: Noah 
> Mendelsohn/Cambridge/IBM)
>         Subject:        Re: Updated versioning strategies doc 
> [XMLVersioning-41  ISSUE-41]
> 
> 
> 
> On Thu, 2007-09-20 at 16:16 -0700, David Orchard wrote:
> > - updated the introduction to hit the 3 main messages right up front
> > 
> http://www.w3.org/2001/tag/doc/versioning-compatibility-strategies-20070920.html
> 
> 
> 
> Dave and I talked over this new material. I can now "see the forest
> for the trees" better as a result. This is the bit with
> the 3 main messages:
> 
> [[
> This finding describes general problems and techniques in evolving
> systems in compatible ways. These techniques are designed to allow
> compatible changes. A number of design patterns and rules are 
> discussed
> with a focus towards enabling versioning. There are a few crucial good
> practices that enable forwards compatible versioning from 
> version 1 of a
> language.
> 
>   The first is to specify the language is extensible.
> 
>   The second is to specify that any text of the language with
>   extensions can be treated as if the extensions were not present.
> 
>   The third is to specify an algorithm for how a text of the
>   language with a version identifier that is unknown can be treated
>   as if the version identifier was known.
> ]]
> 
> (emphasis added by way of list formatting).
> 
> > I expect that these will be revised as per Dan and my action items.
> 
> I offer the above for discussion in today's telcon under...
> 
>   ACTION-51 on David Orchard and Dan to work together to articulate
>     the story that the TAG wants to tell.
> 
> 
> I had some ideas for working on the organization and prose, but
> I didn't get very far with them. I think some of the things
> presented as "Good Practice" notes would work better as a
> pattern language in the sense
> of http://www.c2.com/cgi/wiki?PatternLanguage
> 
> I gave it a try with http://esw.w3.org/topic/IgnoreUnknownTags ...
> I think I/we need to do 2 or 3 more of those to see whether
> it's a good organizational technique overall.
> 
> 
> -- 
> Dan Connolly, W3C http://www.w3.org/People/Connolly/
> 
> 
> 
> 
> 
> 
>
Received on Friday, 2 November 2007 20:53:34 UTC