- From: David Orchard <orchard@pacificspirit.com>
- Date: Tue, 13 May 2008 17:19:40 -0700
- To: www-tag@w3.org
- Message-ID: <2d509b1b0805131719v47583581nba2e9a3f37c6753f@mail.gmail.com>
Replying to Marc's comments. Great review, and thanks. I'll be publishing a new version shortly. I've snipped out the parts that I've agreed with, which is all the editorial and the majority of the non-editorial comments. I've put DBO>> to indicate my responses.

1.2: "Among the various kinds of languages, we find..." It's obvious, but I think it should be made explicit that the doc does not apply to natural language.

1.2: "programming languages such as Java or ECMAScript..." I don't think this Finding, which is mainly about forward compatibility, applies to programming languages either. Suppose the final Python 3 release would include "x" as alternative notation for the multiplication operator. Take the following Python 3 source:

    def double(i):
        i = 2 x i
        return i

If a Python 2.5 processor were to process this source in a forward compatible way, it would have to ignore the statement "i = 2 x i" and thus return the input without doubling it. I can't think of any context where such behaviour would be useful. I think there is a difference between languages which contain mainly (text or typed) data and languages which contain processable instructions (admittedly there is a large overlap between those two), and forward compatibility does not apply to the latter category. Most of the 'Good Practices' mentioned in your doc don't apply to programming languages.

DBO>> We continue to disagree. There are some areas of flexibility in a language. I think you've picked a hard one, but then HTML doesn't allow extra name characters etc. yet we say it is forwards compatible.

2. "Applications are expected to behave properly" There are at least four common relevant behaviours when an application receives a document with an error:

1) produce an error and fail
2) proceed with errors/warnings
3) give a user the option to continue or abort
4) proceed silently

I think the Versioning Doc could mention such distinctions explicitly.
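As a rough illustration of the four behaviours listed above, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the names UnknownPolicy, KNOWN_FIELDS, consume and confirm are hypothetical, the "text" is modelled as a plain dict, and "error" is narrowed to "contains fields the consumer does not know". Neither the Finding nor the review prescribes this shape.

```python
import warnings
from enum import Enum


class UnknownPolicy(Enum):
    FAIL = "fail"      # 1) produce an error and fail
    WARN = "warn"      # 2) proceed with errors/warnings
    ASK = "ask"        # 3) give a user the option to continue or abort
    SILENT = "silent"  # 4) proceed silently


# Hypothetical version 1 of a name language: only these fields are known.
KNOWN_FIELDS = {"first", "last"}


def consume(text, policy, confirm=lambda unknown: False):
    """Return the known fields of `text`, treating unknown ones per `policy`.

    `confirm` stands in for a user prompt; it receives the list of unknown
    field names and returns True to continue or False to abort.
    """
    unknown = sorted(set(text) - KNOWN_FIELDS)
    if unknown:
        if policy is UnknownPolicy.FAIL:
            raise ValueError(f"unknown fields: {unknown}")
        if policy is UnknownPolicy.WARN:
            warnings.warn(f"ignoring unknown fields: {unknown}")
        elif policy is UnknownPolicy.ASK and not confirm(unknown):
            raise RuntimeError("aborted by user")
        # SILENT: nothing is reported.
    return {key: value for key, value in text.items() if key in KNOWN_FIELDS}
```

Calling consume({"first": "Ada", "middle": "K", "last": "Lovelace"}, UnknownPolicy.SILENT) drops the unknown "middle" field without comment, while the same call with UnknownPolicy.FAIL raises. The point of the sketch is only that the choice between the four behaviours can be an explicit parameter of the consumer rather than something left implicit.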
DBO>> I added these in with some wordsmithing, but it still feels rough to me.

2.1 par. 3: "Typically, when introducing a new version using the incompatible approach, all of the software that produces or consumes the texts is updated..." In general, the hidden premise in this Finding is one of exchanging messages, which 'disappear' after consumption. But versioning applies equally well to longer-lived documents. There are a few common cases I think deserve mentioning:

- A very common approach, when a new consumer meets an old text, is for the consumer to upgrade the text (silently or at user option) to the new version. This approach is particularly common in word processors and databases.

- With longer-lived documents and data structures, the relation between a producer and a text may not be one-to-one. For instance, in a database some records may originate from an older producer, others from a newer one. Also in larger markup documents, it is quite possible that some parts have been produced by an older producer, other parts by a newer one. Using version identifiers on a per-record basis in a database is uncommon, as is the use of a version identifier on a per-division (paragraph, sentence, chapter) basis in markup documents. This makes the relation between version and document more complicated: should we assume the entire document to be of the version of the latest producer? There is a relation with the previous point, since a common approach is for a consumer/producer to open a document (or database), check the version, ask the user to convert to the latest version if necessary, or simply write new structures in the old document (database) when this is allowed.

DBO>> I agree that there is a hidden premise that texts disappear after consumption, hence the use of the term consumption. I note your point about upgrading the document, but I'm unsure how that affects the finding.
In the upgrade word processor document approach, I would see that the consumer consumes and upgrades the text, producing a newer version. This falls into the case mentioned where the consumer has been updated. Also, I don't think that this precludes version identifiers from being in subsections of a document, where the subsections are mapped to various database tables. What would you like to see different?

2.1.1, par. 4: "If a name contains first, last, and middle then the previous options yield answers of: 2, 1, 2, 1-2" This is only true if the language has some ignore-unknown strategy, and 'ignore unknown' hasn't been really introduced at this point. So either make explicit that the language V.1 has an ignore-unknown strategy, or omit this part of the example.

DBO>> I slightly disagree, because I'm just specifying the language identification rules here, not the processing rules. The issue of identification vs processing and uncoupling them has been very tough for many years now. I do point forward to the rules for forwards compatibility.

2.1.1.1 general: programming languages, mentioned in 1.2, very often do not have version id's in their texts. C, SQL, or Python source code does not mention the version of C, SQL or Python used.

DBO>> Agreed.

2.1.1.2 This paragraph raises the interesting but complicated issue whether an XML document with content in several namespaces should be considered a document in one language or in several languages. In one sense, it's all XML, in another sense, nested sublanguages.

DBO>> I had earlier text in the strategies document about extensibility vs versioning at http://www.w3.org/tag/docversioning-strategies-20070917.html#iddiv172692152. I expunged this for space reasons.

3: "As this finding focuses on compatible versioning, we provide no more focus on incompatible evolution." I MUST, strongly, persistently, vehemently object to this utter, complete... - well, words fail me - omission.
There is a difference between publishing, HTML style, for the world, where consumers may do as they wish with whatever is published, and messaging, where senders and receivers are (often contractually) bound. Must accept unknowns is a very good approach, but it will work in messaging if, and only if, it can be overruled by some 'must understand' indicator. There is no way in medical prescriptions (my background) or stock orders or any serious messaging to have 'must accept unknowns' as a blanket policy for consumers, without being able to overrule this. Would you accept it if your bank executed a stock order above your maximum price, saying 'Well, we are still on v.1.0, which does not have the max price, and we read the W3C's Versioning Strategies, so...'. This Finding needs some explicit texts on overruling 'must accept unknowns' through 'must understand', or similar mechanisms, which are, in effect, mechanisms which force consumers to be incompatible in some circumstances.

DBO>> I'm very sensitive to this issue. I have attempted to make the finding absolutely as short as possible so as to tell a coherent forwards compatible versioning story. This document is, as stated, about achieving compatible versioning. It is not a general document about versioning. I really believe that I have to limit the scope of this beast somehow to get to an actual Finding. I did add a bit in "Another example is where a producer wants to indicate that an extension must be understood. This could be indicated inline using a mustUnderstand model, such as SOAP or an application specific model." but I am very loath to add any more text. The document must stop growing.

5: "Please select one of the following 3 alternatives for the finding" There are only 2. I'd prefer the second. As I mentioned before, I do not think this should apply to programming languages.

DBO>> The 3rd is the sentence "We have observed that languages that are successfully versioned are generally extensible".
I am personally strongly against this 3rd option and a big proponent of the first.

5: "Extensible" - the links don't jump to the definition, but a bit before it.

DBO>> I don't know why this is.

5.1, par. 2: "Consumers MUST accept text portions..." I think you should say something about what 'accept' means. See the points on levels of error / failure above.

DBO>> hmm...

5.1, pars. 4 and 6: The real distinction is not 'Accept and Ignore' vs. 'Accept and Preserve' (since that approach ignores the content as well) but between 'Ignore and Discard' vs. 'Ignore and Preserve'.

DBO>> I think there are 3 rules: Accept, accept and discard, accept and preserve. The first says nothing about discarding or preserving. It's effectively ignore.

5.3, par. 1 "Good Practice: Default Unknown Version Identifier Handling Rule: Languages MUST provide a default model for unknown version identifiers for forwards-compatible evolution." I believe this is the wrong way around. Newer specs (and thus producers) should provide a way for older consumers to know whether they may process a message. The newer ones have the more complete knowledge. They can insert the old version identifier if desired. This good practice, and the following paragraph, assume too simple an approach, with language versions being either compatible or not. In the Netherlands, we have an annual release of a medical (HL7 based) spec, which contains lots of different messages, some compatible, some not, some partly compatible, etc. There is no way a major.minor language version will do. Even per-message-type major.minor versions are not sufficient, since incompatible content may be optional. I strongly believe the only way to resolve such complexities is if newer producers provide the version identifiers the older consumers expect (of course, only when the older consumers may process the messages).
So - in your examples - I'd say the 1.1 producer would have to insert the 1.0 and 1.1 version id's, and this approach would not need the above Good Practice.

DBO>> I agree with your overall thesis that major.minor doesn't work in non-trivial distributed extensibility. I have struggled for years to express that point but it's not gaining a lot of traction. I've done a big rewrite of the relevant sections.

See you in Dublin,

DBO>> Indeed, it was really great to see you there!

Cheers,
Dave
Received on Wednesday, 14 May 2008 00:31:02 UTC