RE: Comments on July 26, 2006 Versioning Draft from Marc de Graauw on 2006-09-28 (www-tag@w3.org from September 2006)

From: Marc de Graauw <marc@marcdegraauw.com>
Date: Thu, 28 Sep 2006 22:30:09 +0200
To: "'David Orchard'" <dorchard@bea.com>, <www-tag@w3.org>
Message-ID: <005201c6e33c$e43f89b0$0096070a@MARCNOTE>
David Orchard:

| Thank you very much for your detailed comments!  I have gone through
| them in detail with answers inline.

Thank you for your response!

| Please note that I am publishing a new version of the finding on
| Friday which has a significant rewrite of the terminology section.
| I think this has picked up some of your comments.  We will be
| reviewing this next week at our F2F meeting, which I expect will
| result in Yet Another Terminology Rewrite.  I suggest looking at
| that post next rewrite to see if your comments have been completely
| addressed.

I will.

| There are a couple of questions that I asked, and any answers at any
| time would help and answers in the next day or so might make it in by
| Friday.
|
| > 3.1: "If the language can be extended in a compatible way, 
| > then a few specific schema design choices must be followed." 
| > Further on you describe the possibility to transform new 
| > (extended) instances to older instances. If a language makes 
| > such transformation (strip all unknown content) required, the 
| > schemas do not need extensibility (with wildcards), so the 
| > "must" in this sentence is too strong. 
| > 
| I think I'll phrase it as "if the language is intended to be capable of
| compatible extensibility".
| That way the MUST is still true.

This is not exactly what I meant. A language can be made (partially or
fully) compatible in several ways. Using wildcards in a schema is one way.
However, if the language specification imposes processing rules such that:
1) in pre-schema-validation processing all unknown content is removed,
2) after this step, schema-validation is done,
it is possible to achieve forwards compatibility without wildcards.

I don't know whether this is a good thing to do, I just meant that there are
other ways than the schema design choices you mention to achieve forwards
compatibility.
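
As a rough illustration of the two processing steps above, here is a minimal
Python sketch. It is only a sketch under stated assumptions: the set KNOWN_V1
and the function is_valid_v1 are hypothetical stand-ins for "the L1
vocabulary" and "schema validation against a closed L1 schema", and the
name/given/family/middle instance is borrowed from your example further down.

```python
import xml.etree.ElementTree as ET

NS = "http://www.example.org/name/1"
# Assumption: the element names a version 1 consumer knows about.
KNOWN_V1 = {f"{{{NS}}}name", f"{{{NS}}}given", f"{{{NS}}}family"}


def strip_unknown(elem, known):
    """Step 1: remove every descendant element whose name is not known."""
    for child in list(elem):
        if child.tag in known:
            strip_unknown(child, known)
        else:
            elem.remove(child)


def is_valid_v1(root):
    """Step 2: stand-in for validation against a closed (wildcard-free) L1 schema."""
    expected = [f"{{{NS}}}given", f"{{{NS}}}family"]
    return root.tag == f"{{{NS}}}name" and [c.tag for c in root] == expected


v2_instance = f"""<name xmlns="{NS}">
  <given>Dave</given>
  <family>Orchard</family>
  <middle>Bryce</middle>
</name>"""

root = ET.fromstring(v2_instance)
valid_before = is_valid_v1(root)   # the closed L1 model rejects <middle>
strip_unknown(root, KNOWN_V1)      # pre-validation step removes it
valid_after = is_valid_v1(root)    # the stripped instance now passes
```

The L1 content model here has no wildcard at all; forwards compatibility
comes entirely from the stripping step that runs before validation.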
  
| > 4, Good Practice #1: "Languages SHOULD be designed for 
| > extensibility." I feel this is a bit too strong. Most 
| > exchange languages I know of do not implement extensibility 
| > mechanisms in the way you describe, and although this is a 
| > SHOULD, not a MUST, it still means a lot of well-functioning 
| > languages violate this Good Practice. Extensibility should be 
| > an option for a language designer, not a SHOULD. You yourself 
| > show with your discussions of closed systems and security 
| > languages there are perfectly good reasons for not using 
| > extensibility. I myself work in Healthcare, and a Must-Ignore 
| > default to medical information is often not the way to go either...
| > 
| 
| I can understand that pushback and it's totally fair.  In general, this
| has a tough time striking a balance between incompatible versioning and
| compatible versioning.  I have erred on the side of compatible
| versioning, because I know how difficult it can be to design systems for
| this.  I would rather not get into a versioning finding that says "well,
| you can version.  You can do it incompatibly or compatibly.  You
| choose."  If I was to do anything, I would make the finding a harder
| line on compatible versioning, and change it to "compatible versioning
| finding" rather than generalizing.  I prefer findings that say "do x".
| I know it's a choice between x and not x, but I think there is much more
| pain in the world for not planning for compatible versioning than there
| is for planning for compatible versioning.  I hope that helps explain
| the motivation, feel free to pushback again.

I understand the motivation, will consider possible pushback later :-)

| > 9: There is another strategy to versioning which you do not 
| > mention: a producer simply lists in an instance which 
| > consumer versions may process the message. A producer could 
| > thus simply say "Consumers who understand version
| > 2 or 3 may process this message". The advantage is you don't 
| > need mustUnderstand flags everywhere. If a newer version of a 
| > language L2 contains an optional item whose understanding is 
| > mandatory, the producer could require L2 consumers if the 
| > optional item occurs, and L1 or L2 consumers if the optional 
| > item does not occur. Of course the number of versions could 
| > theoretically become high, but in practice there often aren't 
| > that many versions of a language: we have two XML's, two 
| > SOAP's, two UBL's, so this approach is feasible in practice. 
| > It works for forward (in)compatibility since it requires a 
| > newer producer who knows the capabilities of older consumers. 
| 
| Interesting approach.  However, I'm not quite sure I follow it
| completely.  Let's take my name/given/family/middle example.  If I have
| V2 which adds optional middle, would it look something like:
| 
| <name xmlns="http://www.example.org/name/1" worksForVersion="1">
|   <given>Dave</given>
|   <family>Orchard</family>
| </name>
| 
| <!-- then a producer that knows about V1 and V2 creates an instance that
| doesn't have the middle -->
| <name xmlns="http://www.example.org/name/1" worksForVersion="1 2">
|   <given>Dave</given>
|   <family>Orchard</family>
| </name>
| 
| <name xmlns="http://www.example.org/name/1" worksForVersion="2">
|   <given>Dave</given>
|   <family>Orchard</family>
|   <middle>Bryce</middle>
| </name>
| 
| ?  I think the idea of listing multiple versions is extremely
| interesting.  

Yes, this is exactly what I meant.

It would allow a level of granularity which other approaches (mustUnderstand
flags etc.) cannot deliver. For instance, with a mustUnderstand flag one can
enforce a new element type to be understood by an older processor:
<newElement mustUnderstand="1">X</newElement>, but with a worksForVersion it
is even possible to enforce understanding on the content level.  
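
On the consumer side the check could be as simple as the following Python
sketch. The function name may_process is made up for illustration, and
treating a missing worksForVersion attribute as "do not process" is an
assumption of mine, not something either of us has specified.

```python
import xml.etree.ElementTree as ET


def may_process(instance_xml, consumer_version):
    """Return True if the producer listed this consumer's version."""
    root = ET.fromstring(instance_xml)
    # worksForVersion is a whitespace-separated list of version labels.
    # Assumption: a missing attribute yields an empty list, so we refuse.
    listed = root.get("worksForVersion", "").split()
    return str(consumer_version) in listed


without_middle = """<name xmlns="http://www.example.org/name/1" worksForVersion="1 2">
  <given>Dave</given>
  <family>Orchard</family>
</name>"""

with_middle = """<name xmlns="http://www.example.org/name/1" worksForVersion="2">
  <given>Dave</given>
  <family>Orchard</family>
  <middle>Bryce</middle>
</name>"""

v1_accepts_plain = may_process(without_middle, 1)   # version 1 is listed
v1_accepts_middle = may_process(with_middle, 1)     # only version 2 is listed
v2_accepts_middle = may_process(with_middle, 2)
```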

For instance, suppose that in L1 element <traffic-light> has content type
string, and the language enforces a particular behaviour for content 'red'
and 'green', while other values are ignored. Then L2 could also allow
'orange' and 'blue', and require 'orange' to be understood (by stating
worksForVersion="2" in all instances which contain the value 'orange'), yet
allow 'blue' to be ignored by older versions (by stating
worksForVersion="1 2").

The producer has full knowledge of its own language version, and all
previous versions, so the producer can indicate in the instance which
language versions are required of the consumer, and thus achieve optimal
(per-instance) forwards compatibility, if desired. Of course it adds a level
of complexity in producers as well.
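
A producer-side sketch of the traffic-light case, again in Python and again
only an illustration: the names MUST_UNDERSTAND_IN_L2, works_for_version and
make_traffic_light are hypothetical, and the value sets simply encode the
example from the previous paragraphs.

```python
import xml.etree.ElementTree as ET

L1_VALUES = {"red", "green"}
L2_VALUES = L1_VALUES | {"orange", "blue"}
# Assumption: of the values new in L2, only 'orange' must be understood;
# 'blue' may safely be ignored by an L1 consumer.
MUST_UNDERSTAND_IN_L2 = {"orange"}


def works_for_version(value):
    """Decide, per instance, which consumer versions may process it."""
    if value not in L2_VALUES:
        raise ValueError(f"not a valid L2 value: {value!r}")
    if value in MUST_UNDERSTAND_IN_L2:
        return "2"      # L1 consumers must not process this instance
    return "1 2"        # L1 consumers may process (and ignore unknown values)


def make_traffic_light(value):
    """Serialize a <traffic-light> instance with its version list."""
    elem = ET.Element("traffic-light", {"worksForVersion": works_for_version(value)})
    elem.text = value
    return ET.tostring(elem, encoding="unicode")


orange = make_traffic_light("orange")
blue = make_traffic_light("blue")
red = make_traffic_light("red")
```

The decision is made on the content of the instance, which is the
granularity a per-element mustUnderstand flag does not give you.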

| I have a long time blog post sitting on the backburner to talk about
| what does a single version # *really* mean in scenario #2 (as in, what's
| the version *of*).

Would like to read that.

| > In general, backwards (in)compatibility can be defined in the 
| > language specification, since the implementer of a new 
| > consumer will know from the specification which older 
| > versions of the language are processable by it, but forwards 
| > (in)compatibility must be defined in the instance, since the 
| > implementer of a newer producer may not know which versions the 
| > consumers are. 
| 
| I think that's right.  Languages always know about the previous
| languages, but languages don't know about the subsequent languages.

I need to work this out; I have a feeling that this principle, together with
the asymmetrical relationship producer/consumer, and the distinction between
syntactical and semantical compatibility (wildcards are about syntactical
compatibility, mustUnderstand/worksForVersion flags about semantical
compatibility) could very well lead to some general principles in
compatibility and versioning. Your document surely provoked a lot of thought
here.

| > 10. You probably should mention one drawback to extensioning: 
| > if multiple parties "invent" the same (functional) extension 
| > which comes in a new version, getting the extensions back in 
| > sync in the new version may meet with opposition. I don't say 
| > this is a reason for not using extensioning, but I think in 
| > fairness it should be mentioned.
| 
| I think this is a big advantage of namespaces.  That way 2 different
| entities can't come up with the same extension.  So I don't see that
| problem coming up in the context of section 10.  Or do you mean that
| there are identically named and incompatible extensions within the same
| namespace?

No, what I meant is if some parties introduce middle names within their own
namespaces in several different ways, and L2 wants to introduce middle
names, a huge row will follow because everybody will want their version of
the middle name component to become the L2 standard. It's not an important
point; I think in almost all scenarios the benefits of extensions will
outweigh the costs. Without extensions, parties who need middle names
will just send:

<name xmlns="http://www.example.org/name/1" worksForVersion="2">
  <given>Dave Bryce</given>
  <family>Orchard</family>
</name>

which is worse than the worst possible extension. Attribute abuse by
end-users is one of the foremost problems in databases; in exchanges it
won't be different.

Hope this clarifies the previous remarks, good luck with the next version.

Oh, and by the way, I'd be interested in your opinion on my other post:
http://lists.w3.org/Archives/Public/www-tag/2006Aug/0104.html

Since it's partly a terminological issue, it might be addressed by your 'Yet
Another Terminology Rewrite', however.

Cheers,

Marc
Received on Thursday, 28 September 2006 20:31:19 UTC