RE: Updated versioning strategies doc [XMLVersioning-41 ISSUE-41]

David,

Some comments on the latest Strategies document.

1.1: "There are many reasons why a different version of a language may be
needed. A few of them include..."

Comment: You could include localization as a driver for versioning. In UBL and
HL7 it is one of the primary drivers for versioning, and it's probably important
in any context where local laws and regulations play a role.

1.2: "Among the various kinds of languages, we find..."

It's obvious, but I think it should be made explicit that the doc does not apply
to natural language.

1.2: "programming languages such as Java or ECMAScript..."

I don't think this Finding, which is mainly about forward compatibility, applies
to programming languages either.

Suppose the final Python 3 release included "x" as an alternative notation for
the multiplication operator. Take the following Python 3 source:

def double(i):
  i = 2 x i
  return i

If a Python 2.5 processor were to process this source in a forward compatible
way, it would have to ignore the statement "i = 2 x i" and thus return the input
without doubling it. I can't think of any context where such behaviour would be
useful. I think there is a difference between languages which contain mainly
(text or typed) data and languages which contain processable instructions
(admittedly there is a large overlap between those two), and forward
compatibility does not apply to the latter category.
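
To make the point concrete, here is a minimal sketch in present-day Python. The
"x" operator is hypothetical, as above, and lenient_load is an invented
"must ignore" processor that simply drops whichever line the parser rejects; no
real Python implementation works that way. A real processor (strict_load)
rejects the whole source, while the lenient one yields a function that silently
returns its input.

```python
# Hypothetical future source: "x" as a multiplication operator is NOT real Python.
FUTURE_SOURCE = "def double(i):\n  i = 2 x i\n  return i\n"


def strict_load(source):
    """Compile the source the way Python actually does: all or nothing."""
    namespace = {}
    exec(compile(source, "<future>", "exec"), namespace)
    return namespace


def lenient_load(source):
    """Invented 'must ignore' processor: drop each line the parser rejects."""
    lines = source.splitlines()
    while True:
        try:
            code = compile("\n".join(lines) + "\n", "<future>", "exec")
            break
        except SyntaxError as err:
            # Give up if the parser cannot point at a line we still have.
            if err.lineno is None or not (1 <= err.lineno <= len(lines)):
                raise
            del lines[err.lineno - 1]
    namespace = {}
    exec(code, namespace)
    return namespace


if __name__ == "__main__":
    try:
        strict_load(FUTURE_SOURCE)
    except SyntaxError:
        print("strict processor: source rejected")
    double = lenient_load(FUTURE_SOURCE)["double"]
    print("lenient processor: double(21) =", double(21))
```

The lenient processor accepts the text but computes the wrong result, which is
the behaviour I can't see being useful.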

1.2: "Languages may be composed of different languages..."

Here, and in the "Just Names" category, you could mention code lists, which are
the prime example of mixed languages in B2B contexts. Code lists with some
volatility (medication codes, country and currency codes) always need to be
versioned independently of the language that uses them.
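
A rough sketch of what independent versioning could look like, assuming a
message that carries the code list version separately from the schema version.
The Payment type, the field names and the snapshot contents are all invented
for illustration; the snapshots are not authoritative ISO 4217 data.

```python
from dataclasses import dataclass

# Illustrative snapshots only; not authoritative ISO 4217 data.
CURRENCY_CODE_LISTS = {
    "2006": frozenset({"EUR", "USD", "SIT"}),
    "2007": frozenset({"EUR", "USD"}),
}


@dataclass(frozen=True)
class Payment:
    schema_version: str     # version of the message language itself
    code_list_version: str  # version of the code list, which evolves separately
    currency: str
    amount: str


def currency_is_valid(payment):
    """Check the currency against the code list version named in the message."""
    codes = CURRENCY_CODE_LISTS.get(payment.code_list_version)
    if codes is None:
        raise LookupError("unknown code list version: " + payment.code_list_version)
    return payment.currency in codes


if __name__ == "__main__":
    old = Payment("1.0", "2006", "SIT", "10.00")
    new = Payment("1.0", "2007", "SIT", "10.00")
    print(currency_is_valid(old), currency_is_valid(new))
```

The schema version stays at "1.0" in both messages; only the code list moved.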

2.1: "The "big bang" approach is appropriate when the new version is radically
different from its predecessor..."

Big Bang is also needed when a security flaw is discovered in the previous
version.

4.1: "It is possible to reduce the Defined Text Set by removing items and
achieve backwards compatibility, as long as the newer Language's Accept Text set
contains all the texts originally in the Defined Text Set. One mechanism to do
this is to replace the content with a construct that allows the removed
construct."

I don't think it's possible to remove items and have BC. In your example,
"replace the content with a construct that allows the removed construct" means
it isn't removed, only ignored. I don't think that's a problem, though: there
is a widely applied pattern to handle removal, which is to mark the
items-to-be-removed as obsolete for several versions, and then remove them.
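
As a sketch of that pattern, assuming integer language versions and a
hypothetical lifecycle table (the element names and version numbers are
invented): an element is accepted silently before it is marked obsolete,
accepted with a warning while obsolete, and rejected once removed.

```python
import warnings

# Hypothetical vocabulary and lifecycle: element -> (obsolete_in, removed_in).
KNOWN_ELEMENTS = {"name", "email", "fax"}
LIFECYCLE = {"fax": (2, 4)}


def check_element(name, language_version):
    """Return True if `name` may appear in a text of `language_version`."""
    if name not in KNOWN_ELEMENTS:
        return False
    obsolete_in, removed_in = LIFECYCLE.get(name, (None, None))
    if removed_in is not None and language_version >= removed_in:
        return False  # removal has taken effect
    if obsolete_in is not None and language_version >= obsolete_in:
        # Still accepted, but authors are told to stop using it.
        warnings.warn(
            "element '%s' is obsolete and will be removed in version %d"
            % (name, removed_in),
            DeprecationWarning,
            stacklevel=2,
        )
    return True


if __name__ == "__main__":
    for version in (1, 2, 3, 4):
        print(version, check_element("fax", version))
```

Versions 2 and 3 form the grace period; only version 4 actually breaks texts
that still use the element.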

5.1: "Good Practice: Preserve existing information Rule: An Extensible Language
MUST require that any texts with extensions MUST be *compatible* with a text
without the extensions."

I think you use compatible in a different sense here. It's clearly not FC or BC,
so it raises the question of when texts are compatible in this sense.

Regards,

Marc de Graauw

http://www.marcdegraauw.com
 

| -----Original Message-----
| From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] 
| On Behalf Of David Orchard
| Sent: vrijdag 26 oktober 2007 19:55
| To: noah_mendelsohn@us.ibm.com; Dan Connolly
| Cc: www-tag
| Subject: RE: Updated versioning strategies doc 
| [XMLVersioning-41 ISSUE-41]
| 
| 
| Righto.  
| 
| I did an update, now available at 
| http://www.w3.org/2001/tag/doc/versioning-compatibility-strategies
| http://www.w3.org/2001/tag/doc/versioning-compatibility-strategies-20071026.html
| 
| Comments inline.
| 
| Cheers,
| Dave
| 
| > -----Original Message-----
| > From: www-tag-request@w3.org [mailto:www-tag-request@w3.org] 
| > On Behalf Of noah_mendelsohn@us.ibm.com
| > Sent: Tuesday, October 16, 2007 6:49 PM
| > To: Dan Connolly
| > Cc: www-tag
| > Subject: Re: Updated versioning strategies doc 
| > [XMLVersioning-41 ISSUE-41]
| > 
| > 
| > Sorry it's taken a while for me to get back to this.  I think this is a
| > step forward, in that it is at least clear and succinct in setting out
| > a few key points.  As to whether I think these are the right points to
| > make, that's a tougher call:
| > 
| > > This finding describes general problems and techniques in evolving
| > > systems in compatible ways. These techniques are designed to allow
| > > compatible changes.
| > 
| > OK. 
| > 
| > The next comment is essentially editorial:
| > 
| > > A number of design patterns and rules are discussed with a focus 
| > > towards enabling versioning.
| > 
| > I'm not sure that versioning is a well defined noun when used in this
| > way.  In fact, would it make sense to skip the 2nd sentence, leaving:
| > 
| > "This finding describes general problems and techniques in 
| > evolving systems in compatible ways. A number of design 
| > patterns and rules are discussed with a focus towards 
| > enabling compatible changes (to languages?)."?
| > 
| > > There are a few crucial good practices that enable forwards compatible
| > > versioning from version 1 of a language.
| > 
| > Hmm.  Just jumped to forward compatibility, when a minute ago 
| > you said "general problems in evolving systems in compatible 
| > ways".  That surely goes beyond forward compatibility? 
| > 
| 
| True.  I reduced the scope to forwards compatibility.
| 
| > > There are a few crucial good practices that enable forwards compatible
| > > versioning from version 1 of a language.
| > > 
| > >   The first is to specify the language is extensible.
| > 
| > Not quite sure what this means.  First of all, I suspect it should be:
| > "The first is to specify >that< the language is extensible.", but even
| > then, what is that adding, except as an introduction?  It's the syntax
| > and semantics of the language that do or don't make it extensible, not
| > the fact that you say it is.  I don't think you specify that the
| > language is extensible; you might instead note that the language is
| > extensible.  Extensibility is an emergent property of the design.
| > 
| > > The second is to specify that any text of the language with extensions
| > > can be treated as if the extensions were not present.
| > 
| > This is the one where I have a significant problem.  I think when you
| > look at many extensible languages they do not necessarily provide that
| > what we're calling extensions are completely ignored by earlier
| > versions.  I've come to believe that what many important languages do
| > is to provide default processing rules.
| > 
| > So you might ask: in what sense is extension going on if the 
| > syntax already has >some< semantic in V1?  Well, what tends 
| > to happen is that in
| > V2 the generic processing rule is replaced by something 
| > specific. Consider a simple language for transferring named 
| > properties, and apply it in particular to names of people.  
| > The specification for V1 might say:
| > 
| > V1 Specification:  Properties are transmitted one per line, 
| > as a space separated pair, with the name of the property 
| > first, and its value second. 
| >  In V1 of the language, the following properties are given special
| > significance:  firstname is used for a person's given name, 
| > and lastname is used for his or her family name.  All 
| > properties are considered part of the name, whether or not 
| > they are given particular meaning in this version of the 
| > specification.  So, for example, an application storing the 
| > name SHOULD record all properties, not just these two, etc.  
| > Note that in future versions of this specification, 
| > additional properties may be given particular significance.
| > 
| > V2 Specification:  The V2 language is the same as V1, except 
| > that the property middlename is now understood to be the 
| > person's middle name.
| > 
| > I think this is a fine example of a forwards compatible 
| > specification. 
| > People do this stuff all the time, but the substitution-based 
| > approach doesn't say anything about it.  RFC 2616 Section 
| > 5.3, for example, specifies that: "Unrecognized header fields 
| > are treated as entity-header fields." and "Request-header 
| > field names can be extended reliably only in combination with 
| > a change in the protocol version."  So, there's a default 
| > behavior in HTTP 1.1 (treat as entity-headers), and a warning 
| > that for some headers future versions might supply a more 
| > distinguished semantic. 
| > I'm pretty sure a similar case could be made regarding the 
| > default relay semantic for SOAP headers, but I don't have 
| > time to remind myself of the pertinent details this evening.
| > 
| > As I say, I think this sort of thing is common and extremely 
| > valuable.  I really would like our finding to tell a story 
| > about it, and I worry that "treat as if the extensions 
| > weren't present" is really just a special case.  I think we 
| > should probably start with the "default semantic", and then 
| > say how one example of a default semantic is "treat as if it 
| > weren't there at all".
| 
| Right.  This reminds me of a conversation I'm having with Ian Hickson
| about the difference between an error that is recovered from nicely
| and a language design that deliberately recovers nicely from extra
| things.
| 
| 
| I can live with "ignoring" being a special case, and the later parts of
| the finding do exactly that.  In fact, the later parts say that "accept
| and throw away" and "accept and retain" are both different flavours of
| must accept rules.
| 
| We are quite in agreement that part of forwards compatibility is
| allowing extra things (bullet #1) and specifying that those extra
| things have a processing model that should not be failure (#2).
| 
| I've tried to reword the 2nd item into "any extensions in a text of the
| language have a well-defined meaning that at a minimum is that the
| extensions are acceptable;"
| 
| > 
| > > The third is to specify an algorithm for how a text of the language
| > > with a version identifier that is unknown can be treated as if the
| > > version identifier was known.
| > 
| > Why are we limiting ourselves to languages in which the 
| > version used is to be explicitly signaled in band in the 
| > text?  Many, many languages (FORTRAN, at least some flavors 
| > of C, Java and many other programming languages come to mind) 
| > never answer explicitly the question: "which versions of the 
| > spec. did I have in mind when I wrote this?", and for those 
| > languages step 3 doesn't apply.  In fact, when you see a text 
| > that's valid per some version, you can never be 100% sure 
| > whether the author intentionally authored to that version, or 
| > accidentally wrote something legal (I.e. made a mistake but 
| > happened to create correct new version
| > syntax) while in fact reading an earlier version of the 
| > specification. 
| > Version id's tend to be a cross check against such things, 
| > IMO, except when a language has evolved in truly incompatible 
| > ways.  Then it's really important to say in the document "you better
| > interpret this per V5, or else what you conclude may be wrong!"
| > 
| > So, I don't think we should limit ourselves to languages with 
| > in band version signaling.  If we do make such a limiting 
| > assumption, we should say so rather explicitly before making 
| > the third statement.  E.g. "This analysis limits itself to 
| > the special case of languages that can signal within each 
| > text the version(s) of the language to which the document was 
| > authored.  For such languages, the third step is to specify 
| > an algorithm...."
| 
| I never had the intention of limiting ourselves to languages that have
| in band versioning information.  However, that does seem to be the vast
| majority of cases on the Web and in almost every case with XML.  How
| about:
| "if the texts of the language contain version identifiers, then texts
| where the version identifier is unknown can be treated as if the
| version identifier was known."
| 
| > 
| > Noah
| > 
| > --------------------------------------
| > Noah Mendelsohn
| > IBM Corporation
| > One Rogers Street
| > Cambridge, MA 02142
| > 1-617-693-4036
| > --------------------------------------
| > 
| > 
| > 
| > 
| > 
| > 
| > 
| > 
| > Dan Connolly <connolly@w3.org>
| > Sent by: www-tag-request@w3.org
| > 10/04/2007 11:28 AM
| >  
| >         To:     David Orchard <dorchard@bea.com>
| >         cc:     www-tag <www-tag@w3.org>, (bcc: Noah 
| > Mendelsohn/Cambridge/IBM)
| >         Subject:        Re: Updated versioning strategies doc 
| > [XMLVersioning-41  ISSUE-41]
| > 
| > 
| > 
| > On Thu, 2007-09-20 at 16:16 -0700, David Orchard wrote:
| > > - updated the introduction to hit the 3 main messages right up front
| > > 
| > http://www.w3.org/2001/tag/doc/versioning-compatibility-strategies-20070920.html
| > 
| >  
| > 
| > Dave and I talked over this new material. I can now "see the forest
| > for the trees" better as a result. This is the bit with
| > the 3 main messages:
| > 
| > [[
| > This finding describes general problems and techniques in evolving
| > systems in compatible ways. These techniques are designed to allow
| > compatible changes. A number of design patterns and rules are discussed
| > with a focus towards enabling versioning. There are a few crucial good
| > practices that enable forwards compatible versioning from version 1 of a
| > language.
| > 
| >   The first is to specify the language is extensible.
| > 
| >   The second is to specify that any text of the language with
| >   extensions can be treated as if the extensions were not present.
| > 
| >   The third is to specify an algorithm for how a text of the
| >   language with a version identifier that is unknown can be treated
| >   as if the version identifier was known.
| > ]]
| > 
| > (emphasis added by way of list formatting).
| > 
| > > I expect that these will be revised as per Dan and my action items.
| > 
| > I offer the above for discussion in today's telcon under...
| > 
| >   ACTION-51 on David Orchard to And Dan work together to articulate
| >     the story that the TAG wants to tell.
| > 
| > 
| > I had some ideas for working on the organization and prose, but
| > I didn't get very far with them. I think some of the things
| > presented as "Good Practice" notes would work better as a
| > pattern language in the sense
| > of http://www.c2.com/cgi/wiki?PatternLanguage
| > 
| > I gave it a try with http://esw.w3.org/topic/IgnoreUnknownTags ...
| > I think I/we need to do 2 or 3 more of those to see whether
| > it's a good organizational technique overall.
| > 
| > 
| > -- 
| > Dan Connolly, W3C http://www.w3.org/People/Connolly/
| > 
| > 
| > 
| > 
| > 
| > 
| > 
| 
| 
| 

Received on Monday, 5 November 2007 11:17:11 UTC