- From: <noah_mendelsohn@us.ibm.com>
- Date: Mon, 25 Feb 2008 20:43:55 -0500
- To: "David Orchard" <dorchard@bea.com>
- Cc: www-tag@w3.org
In preparation for the F2F, I'm doing another readthrough of the Versioning Strategies document [1]. Here are some comments, in document order, not priority order:

--------

Section 1.2:

> Some typical backwards- and forwards-compatible changes:
>
> * adding optional components ( in XML, this is generally
> elements and/or attributes)

Optional elements are only compatible changes if their presence doesn't modify or conflict with the semantics of an existing element. If I add a component that changes the semantics of some other component (e.g. ignore this, mustUnderstand, currencyUnit=Pesos) then the change is often not compatible (and especially not forwards compatible), even if the new component is syntactically optional.

--------

Section 2:

> In broad terms, the strategies to versioning fall into a numberof classes

(editorial) I think that should be:

"In broad terms, strategies for versioning fall into a number of classes"

-or-

"In broad terms, strategies for versioning may be grouped into a number of classes"

--------

Section 2:

> Different application domains will choose different approaches.

(Editorial) I don't think a domain is something that typically makes a choice. Suggest something along the lines of:

"Different choices may be appropriate for different applications"

-or-

"Different choices may be appropriate for different application domains."

--------

Section 2:

> The dependencies makes it imperative to plan for versioning from the start.

Two concerns: 1) It's not clear what dependencies you're talking about, 2) (grammar) "dependencies" is plural, so I think the phrase should be "The dependencies >make< it imperative"

--------

Section 2.1:

> "Big bang" is a very coarse-grained approach to versioning. It
> establishes a single version identifier, either a version
> number or namespace name, for an entire text.

I'm confused about this at a few levels.
First, it seems to imply that using a single namespace for all of the content in one particular instance document (a text) is a mistake. I don't think that's what you meant. I suspect that the point you're making really doesn't have to do directly with namespaces or version identifiers, but has to do with policies for knowing how much of a document a consumer can continue to safely process given that some of the content is not what was expected. That issue can arise, and solutions are possible, without any reference to either namespaces or version identifiers, I think. Just as one example, almost any syntactic marker that indicates that processing of contents is optional (such as mustUnderstand="false" in SOAP) seems to achieve non-big-bang versioning without any mention of either namespaces or version ids.

If I'm right about this, then the remainder of 2.1 needs to be rewritten to define big bang in a way that's less focussed on version identifiers.

-----------

Section 2.2 starts with a hyperlink to "Forwards compatible". Shouldn't that hyperlink be where the term is first used? Note that there is text that seems quite close to another definition of forwards and backward compatible in the bullets immediately under the start of section 2.

-----------

Section 2.2.2:

> Forwards compatible evolution of a language means that
> producers of texts in language should be able

(typo) should be "in >a< language"

----------

Section 2.2.2:

> A supreme example of the benefits of extensibility is HTML.

Seems a bit strong. Suggest: "A good example of the benefits of extensibility is HTML."

----------

Section 2.2.2.1: Formatting bug. The header is indented under the Good Practice Note above. This is a bug in either the XSL or CSS stylesheets. I've found that this happens if a GPN is the last thing in a section. You can beat it by putting an empty paragraph in the XML just after the GPN, and ahead of the header for 2.2.2.1.
---------

Section 2.2.2.1:

> "By the definition of Extensibility, there is a mapping from
> all texts with additional syntax to texts without."

This formulation keeps coming up, and I still believe it's much too limiting. Version 1 need not know how to map all extension content to some text in which the extension content doesn't appear: it must have some default rule for interpreting the extension content. That rule need not be to ignore it entirely for all applications. For example, there's no reason a version 1 storage application shouldn't store the entire contents of a document it receives, including markup it doesn't fully understand. Of course, it will need to understand any markup that controls the storage operation, but other content it can blindly store. Maybe it does more than that: maybe by default it removes whitespace from extension content and stores the results. Many important and interesting applications do extensibility this way. I'm really reluctant to imply that the only default processing rule is a mapping that makes the extension content disappear.

------------

Section 2.2.2.1:

> "Must Accept Unknowns Rule: Consumers MUST accept any text
> portion that they do not recognize."

This is way too strong in my opinion. Let's say I decide to make an extension to XML that allows attributes to be quoted with backquotes as well as forward quotes:

<element a=`backquotedattr`/>

Are you really saying that XML violates good practice because XML processors will reject this "text that it does not recognize"? Taken to its extreme, this GPN seems to imply that an XML processor should quietly accept a FORTRAN program as just one giant "text that it doesn't recognize". In fact, almost all extensible languages have syntactic rules that are fixed and that cannot be changed without breaking compatibility.
Within those rules, extensible languages encourage processors to accept and provide some default interpretation to content that wasn't specified in detail in the original specification. (Note that the first versions of HTML do allow for <img> just as they allow for <frog> and <banana>; what they don't do is call out those tags specifically or give them any distinguished interpretation.)

-----------

Section 2.2.2.1:

> Preserve existing information Rule: An Extensible Language MUST
> require that any texts with extensions MUST be compatible with
> a text without the extensions.

For the reasons stated above, I don't think this is right in all cases either. I think an extensible language must provide for a default interpretation of any extensions that may be present. Why that interpretation should be equivalent to some text without the extension isn't motivated in the finding at all. In fact, I think SOAP is a counterexample, if you're willing to grant that headers are extensions. I'm fairly sure that if in SOAP I have a header that's mustUnderstand="false", I need not process it locally, but I think that if I'm an intermediary I in general MUST relay that header downstream, even though I don't understand it. If the GPN requires that the message be equivalent to one without the header, how can I do that? So, that's another example of a default processing rule for extension content: don't ignore it, relay it.

Ah... I see now that you acknowledge this use case:

> Must Accept and Preserve Unknowns Rule (Must Accept variant 2):
> Consumers MUST accept and preserve any text portion that they
> do not recognize.

I feel pretty strongly that this flat out contradicts the first GPN quoted above. Either the text is equivalent to one without the extension OR you must preserve it. I don't think you can have it both ways. The HTTP proxy in your example is not acting as if the document had some equivalent without the extensions.
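To make the "don't ignore it, relay it" default rule concrete, here is a minimal Python sketch. It is a deliberately simplified model and not the full SOAP processing model: the Header class, the relay function, and the header names are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Header:
    """Hypothetical, simplified stand-in for a SOAP-style header block."""
    name: str
    value: str
    must_understand: bool = False


class NotUnderstood(Exception):
    """Raised when a mandatory header is not recognized."""


def relay(headers, understood):
    """Apply a 'relay what you do not understand' default rule.

    Returns (processed, forwarded): headers handled locally, and
    unrecognized optional headers preserved unchanged for the next hop.
    """
    processed, forwarded = [], []
    for h in headers:
        if h.name in understood:
            processed.append(h)          # handled locally
        elif h.must_understand:
            raise NotUnderstood(h.name)  # mandatory but unknown: reject
        else:
            forwarded.append(h)          # unknown and optional: keep and relay
    return processed, forwarded


# Hypothetical usage: the intermediary only understands "routing".
incoming = [Header("routing", "node-b"), Header("x-audit", "trace-17")]
processed, forwarded = relay(incoming, understood={"routing"})
```

In this sketch the outgoing message still carries the x-audit header, so it is not equivalent to a message without the extension, which is the tension described above.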
I would drop that first GPN, probably replacing it with one that requires a default processing rule for extension content.

-----------

Section 2.2.2.1

> In tree based languages, which includes all markup languages,

I can't invent a markup language that would, for example, represent graphs more general than trees?

-----------

Section 2.2.2.1: GENERAL COMMENT

I think that we usually use Good Practice Notes for things that you usually SHOULD do. Some of the GPNs in this finding seem to be alternatives that are advertised as being more or less equally good choices. I think you need to do some explaining as to which GPNs are to be followed as good practice more or less all the time, and which are intended as alternatives. For example "Must Accept ALL" vs. "Must accept container" seem to be just two choices, and they are mutually exclusive.

----------

Section 2.2.2.3

> Languages MUST provide a substitution model for version identifiers for forwards-compatible evolution.

(Do we put MUSTs in GPNs? Usually, SHOULD feels better in a GPN. Maybe it's OK, not sure.) Anyway, as an example of this you give:

> There could be an algorithmic approach. For version numbers,
> one could say that version numbers will only have a "major"
> change if there is an incompatible change. For example, version
> 1.1 of a language is by definition compatible with version 1.0
> and version 2.0 is incompatible.

I understand why this is a common strategy and sometimes a good one. I'm less clear on why it's an example of a substitution rule. What's being substituted for what when I say "gee, I can't process this document because it's got a newer major version number than I understand"?

--------

Section 3.0:

> there are some key requirements the language designer consider
> in choosing a strategy and design.

(typo) there are some key requirements the language designer >should< consider in choosing a strategy and design.
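For reference, the major/minor convention quoted under Section 2.2.2.3 above can be sketched in a few lines of Python. The function name can_process and the "major.minor" string format are assumptions made for illustration; nothing here is taken from the draft.

```python
def can_process(document_version: str, supported_version: str) -> bool:
    """Return True if a consumer built for `supported_version` may process the document.

    Assumes "major.minor" version strings and the convention that only a
    change in the major number signals an incompatible change.
    """
    doc_major = int(document_version.split(".")[0])
    supported_major = int(supported_version.split(".")[0])
    return doc_major == supported_major


# A 1.0 consumer accepts a 1.1 document but not a 2.0 document.
accepts_minor = can_process("1.1", "1.0")
accepts_major = can_process("2.0", "1.0")
```

Note that a check of this kind only accepts or rejects a document; the sketch contains no step in which one thing is substituted for another.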
--------

Section 3.0:

> It is sometimes desirable to prevent 3rd parties from extending
> languages, but it does happen.
> An example may be a tightly constrained security environment
> where distributed authoring is considered a "bug" rather than a feature.

The first sentence doesn't parse unambiguously. Is it "preventing" that does happen, or "extending"? On first reading, I assumed you meant extending, then when I saw the next sentence I thought probably not. Either way, I think the sentence should be reworded. How about:

"Allowing 3rd parties to extend a language can be extremely valuable for applying the language to new use cases, but in other situations such extensibility may be undesirable. For example, there may be situations in which security concerns dictate that the language specification be centrally maintained."

--------

Section 3.3:

> If so, a substitution mechanism is required for forwards compatibility.

Suggest: "If so, a default processing rule is required for forwards compatibility." (same reason as discussed earlier)

------

I've skimmed the rest of the draft, but not reviewed it in as much detail. I think the general nature of my comments would be similar. I hope they are helpful. Thank you.

Noah

[1] http://www.w3.org/2001/tag/doc/versioning-strategies-20070917.html

--------------------------------------
Noah Mendelsohn
IBM Corporation
One Rogers Street
Cambridge, MA 02142
1-617-693-4036
--------------------------------------
Received on Tuesday, 26 February 2008 01:43:19 UTC