Re: Updated versioning strategies doc [XMLVersioning-41 ISSUE-41] from noah_mendelsohn@us.ibm.com on 2007-10-17 (www-tag@w3.org from October 2007)

From: <noah_mendelsohn@us.ibm.com>
Date: Tue, 16 Oct 2007 21:49:26 -0400
To: Dan Connolly <connolly@w3.org>
Cc: www-tag <www-tag@w3.org>
Message-ID: <OFEECBEF7F.EF7345B4-ON85257377.000680FC-85257377.0009D25B@lotus.com>
Sorry it's taken a while for me to get back to this.  I think this is a 
step forward, in that it is at least clear and succinct in setting out a 
few key points.  As to whether I think these are the right points to make, 
that's a tougher call:

> This finding describes general problems and techniques in evolving
> systems in compatible ways. These techniques are designed to allow
> compatible changes.

OK. 

The next comment is essentially editorial:

> A number of design patterns and rules are discussed with a 
> focus towards enabling versioning.

I'm not sure that versioning is a well-defined noun when used in this way. 
 In fact, would it make sense to skip the 2nd sentence, leaving:

"This finding describes general problems and techniques in evolving 
systems in compatible ways. A number of design patterns and rules are 
discussed with a focus towards enabling compatible changes (to 
languages?)."?

> There are a few crucial good practices that enable forwards 
> compatible versioning from version 1 of a language.

Hmm.  Just jumped to forward compatibility, when a minute ago you said 
"general problems in evolving systems in compatible ways".  That surely 
goes beyond forward compatibility? 

> There are a few crucial good practices that enable forwards 
> compatible versioning from version 1 of a
> language.
> 
>   The first is to specify the language is extensible.

Not quite sure what this means.  First of all, I suspect it should be: 
"The first is to specify >that< the language is extensible.", but even 
then, what is that adding, except as an introduction?  It's the syntax and 
semantics of the language that do or don't make it extensible, not the fact 
that you say it is.   I don't think you specify that the language is 
extensible;  you might instead note that the language is extensible. 
Extensibility is an emergent property of the design.

> The second is to specify that any text of the language with 
> extensions can be treated as if the extensions were not present.

This is the one where I have a significant problem.  I think when you look 
at many extensible languages, they do not necessarily provide that what 
we're calling extensions are completely ignored by earlier versions.  I've 
come to believe that what many important languages do is to provide default 
processing rules. 

So you might ask: in what sense is extension going on if the syntax 
already has >some< semantic in V1?  Well, what tends to happen is that in 
V2 the generic processing rule is replaced by something specific. Consider 
a simple language for transferring named properties, and apply it in 
particular to names of people.  The specification for V1 might say:

V1 Specification:  Properties are transmitted one per line, as a space 
separated pair, with the name of the property first, and its value second. 
 In V1 of the language, the following properties are given special 
significance:  firstname is used for a person's given name, and lastname 
is used for his or her family name.  All properties are considered part of 
the name, whether or not they are given particular meaning in this version 
of the specification.  So, for example, an application storing the name 
SHOULD record all properties, not just these two, etc.  Note that in 
future versions of this specification, additional properties may be given 
particular significance.

V2 Specification:  The V2 language is the same as V1, except that the 
property middlename is now understood to be the person's middle name.
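
As a rough illustration only, the two specifications above might be 
processed along the lines of the Python sketch below.  The names used 
(V1_KNOWN, V2_KNOWN, parse_properties, interpret) and the sample text are 
hypothetical, not taken from any specification.  The point is that a V1 
processor applies the default rule to middlename (keep it as part of the 
name), while a V2 processor replaces that default with a specific meaning.

```python
# Hypothetical sketch of the V1/V2 property language described above.
V1_KNOWN = {"firstname": "given name", "lastname": "family name"}
V2_KNOWN = {**V1_KNOWN, "middlename": "middle name"}

DEFAULT_MEANING = "part of the name (default rule)"


def parse_properties(text):
    """Parse one property per line: the name, a space, then the value."""
    props = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        name, _, value = line.strip().partition(" ")
        props[name] = value
    return props


def interpret(text, known):
    """Record every property; attach a specific meaning only where known.

    Properties with no specific meaning in this version are still stored,
    under the default rule, rather than being dropped.
    """
    return {
        name: (value, known.get(name, DEFAULT_MEANING))
        for name, value in parse_properties(text).items()
    }


doc = "firstname Mary\nmiddlename Ann\nlastname Smith"
v1_view = interpret(doc, V1_KNOWN)  # middlename kept under the default rule
v2_view = interpret(doc, V2_KNOWN)  # middlename given its V2 meaning
```

Both processors retain the same set of properties; only the meaning 
attached to middlename differs between the two versions.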

I think this is a fine example of a forwards compatible specification. 
People do this stuff all the time, but the substitution-based approach 
doesn't say anything about it.  RFC 2616 Section 5.3, for example, 
specifies that: "Unrecognized header fields are treated as entity-header 
fields." and "Request-header field names can be extended reliably only in 
combination with a change in the protocol version."  So, there's a default 
behavior in HTTP 1.1 (treat as entity-headers), and a warning that for 
some headers future versions might supply a more distinguished semantic. 
I'm pretty sure a similar case could be made regarding the default relay 
semantic for SOAP headers, but I don't have time to remind myself of the 
pertinent details this evening.
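
A minimal Python sketch of the HTTP default rule quoted above, under 
stated assumptions: the two sets list only a few of the field names from 
RFC 2616 (Sections 4.5 and 5.3), and classify_header is a hypothetical 
helper, not part of any HTTP library.

```python
# Illustrative subsets only; RFC 2616 defines more names in each category.
GENERAL_HEADERS = {"cache-control", "connection", "date", "via"}
REQUEST_HEADERS = {"accept", "host", "range", "referer", "user-agent"}


def classify_header(name):
    """Classify a request's header field name (case-insensitively).

    Names this processor does not recognize fall under the default rule
    from RFC 2616 Section 5.3: they are treated as entity-header fields.
    """
    key = name.lower()
    if key in GENERAL_HEADERS:
        return "general-header"
    if key in REQUEST_HEADERS:
        return "request-header"
    return "entity-header"
```

Here a made-up field such as X-Example-Extension would be classified as 
an entity-header, which is the default semantic at work.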

As I say, I think this sort of thing is common and extremely valuable.  I 
really would like our finding to tell a story about it, and I worry that 
"treat as if the extensions weren't present" is really just a special 
case.  I think we should probably start with the "default semantic", and 
then say how one example of a default semantic is "treat as if it weren't 
there at all".

> The third is to specify an algorithm for how a text of the 
> language with a version identifier that is unknown can be 
> treated as if the version identifier was known.

Why are we limiting ourselves to languages in which the version used is to 
be explicitly signaled in band in the text?  Many, many languages 
(FORTRAN, at least some flavors of C, Java and many other programming 
languages come to mind) never answer explicitly the question: "which 
versions of the spec. did I have in mind when I wrote this?", and for 
those languages step 3 doesn't apply.  In fact, when you see a text that's 
valid per some version, you can never be 100% sure whether the author 
intentionally authored to that version, or accidentally wrote something 
legal (i.e., made a mistake but happened to create correct new version 
syntax) while in fact reading an earlier version of the specification. 
Version IDs tend to be a cross check against such things, IMO, except 
when a language has evolved in truly incompatible ways.  Then it's really 
important to say in the document "you better interpret this per V5, or 
else what you conclude may be wrong!"

So, I don't think we should limit ourselves to languages with in band 
version signaling.  If we do make such a limiting assumption, we should 
say so rather explicitly before making the third statement.  E.g. "This 
analysis limits itself to the special case of languages that can signal 
within each text the version(s) of the language to which the document was 
authored.  For such languages, the third step is to specify an 
algorithm...."
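
For that limited case, one possible algorithm of the kind the third point 
asks for can be sketched in Python as follows.  The major.minor 
convention, the KNOWN_VERSIONS table, and the resolve_version name are all 
assumptions made for illustration; the draft finding does not prescribe 
this rule.

```python
# Hypothetical: the versions this processor implements, as (major, minor).
KNOWN_VERSIONS = [(1, 0), (1, 1), (2, 0)]


def resolve_version(identifier):
    """Map a 'major.minor' identifier to a version this processor knows.

    Assumed rule: an unknown minor version is processed as the highest
    known minor of the same major version.  An unknown major version is
    rejected, since the change may be incompatible.
    """
    major, minor = (int(part) for part in identifier.split("."))
    if (major, minor) in KNOWN_VERSIONS:
        return (major, minor)
    same_major = [v for v in KNOWN_VERSIONS if v[0] == major]
    if not same_major:
        raise ValueError(f"unsupported major version: {identifier}")
    return max(same_major)
```

Under these assumptions a text labeled "1.7" would be processed as 1.1, 
while a text labeled "3.0" would be refused rather than misread.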

Noah

--------------------------------------
Noah Mendelsohn 
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------


Dan Connolly <connolly@w3.org>
Sent by: www-tag-request@w3.org
10/04/2007 11:28 AM
 
        To:     David Orchard <dorchard@bea.com>
        cc:     www-tag <www-tag@w3.org>, (bcc: Noah 
Mendelsohn/Cambridge/IBM)
        Subject:        Re: Updated versioning strategies doc 
[XMLVersioning-41  ISSUE-41]



On Thu, 2007-09-20 at 16:16 -0700, David Orchard wrote:
> - updated the introduction to hit the 3 main messages right up front
> 
http://www.w3.org/2001/tag/doc/versioning-compatibility-strategies-20070920.html

 

Dave and I talked over this new material. I can now "see the forest
for the trees" better as a result. This is the bit with
the 3 main messages:

[[
This finding describes general problems and techniques in evolving
systems in compatible ways. These techniques are designed to allow
compatible changes. A number of design patterns and rules are discussed
with a focus towards enabling versioning. There are a few crucial good
practices that enable forwards compatible versioning from version 1 of a
language.

  The first is to specify the language is extensible.

  The second is to specify that any text of the language with
  extensions can be treated as if the extensions were not present.

  The third is to specify an algorithm for how a text of the
  language with a version identifier that is unknown can be treated
  as if the version identifier was known.
]]

(emphasis added by way of list formatting).

> I expect that these will be revised as per Dan and my action items.

I offer the above for discussion in today's telcon under...

  ACTION-51 on David Orchard to And Dan work together to articulate
    the story that the TAG wants to tell.


I had some ideas for working on the organization and prose, but
I didn't get very far with them. I think some of the things
presented as "Good Practice" notes would work better as a
pattern language in the sense
of http://www.c2.com/cgi/wiki?PatternLanguage

I gave it a try with http://esw.w3.org/topic/IgnoreUnknownTags ...
I think I/we need to do 2 or 3 more of those to see whether
it's a good organizational technique overall.


-- 
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Wednesday, 17 October 2007 01:48:07 UTC