- From: Dare Obasanjo <dareo@microsoft.com>
- Date: Thu, 23 Oct 2003 09:15:26 -0700
- To: "Dan Connolly" <connolly@w3.org>, <www-tag@w3.org>
I'd also want to add my voice to the request that the XML Schema specific bits be removed from any documents on versioning XML vocabularies produced by the TAG. The main problem I have with including them is that W3C XML Schema does not directly support the most common way XML vocabularies are versioned in the wild (change the version attribute & stick in more elements & attributes) without seemingly absurd contortions and limitations. I'd rather not see a TAG document on versioning contain rationalizations of what are basically design flaws in W3C XML Schema, nor would it make sense to promote them as good architecture for the WWW.

________________________________

From: www-tag-request@w3.org on behalf of Dan Connolly
Sent: Fri 10/17/2003 5:44 AM
To: www-tag@w3.org
Subject: on "Versioning XML Languages"

My comments on
http://www.w3.org/2001/tag/doc/versioning.html
of 18 Sep 2003

Summary: The enumeration of strategies in section 2 Versioning Strategies is good and important stuff, but the thesis of the finding is either buried or off the mark, the boxed points are insufficiently justified, and it needs a lot of editorial work (terminology is loose, items in references section aren't cited in the body, etc.).

caveat: I didn't read the whole thing; I sorta lost the story line around section 5 or 6.

I could stand for the XML Schema specific bits to be split out into a separate document.

Comments as I read it:

|XML is designed for the creation of tag sets, languages of elements and attributes.

The term "tag set" is introduced here but not used elsewhere.

suggest: XML, Extensible Markup Language, provides common constructs, elements and attributes, for use in a large and growing variety of languages.

| XML is self-describing in the minimal sense that any XML parser can recognize the namespaced elements and attributes, attribute values, and text content of a document.

Hmm... I don't see how that makes it self-describing. suggest striking that.
XML documents that include DTDs are self-describing in that the DTD part describes the other part. XML documents can participate in a self-describing Web by way of namespace names and namespace documents (I prefer the term "grounded in the web" to "self-describing" for that sense anyway). I suppose you could say that <partnum>43</partnum> is somewhat self-descriptive, but only to an agent that has some prior understanding of the term "partnum".

| It is designed for the combination of languages in instance documents.

suggest: Its self-similar syntax supports documents composed from multiple languages.

| This paper discusses how developers can design with extensibility and change in mind, making backward-compatible and forward-compatible changes possible in the future.

"This paper discusses..." -- now *that's* self-describing. Too meta for my tastes. Do we need this sort of preface material in addition to the abstract? I think not.

Hmm... right into 1.1 Terminology without actually establishing the thesis of the finding. Surely the thesis is something about evolution of XML languages being a necessary part of the continuous evolution of the Web, yes? Perhaps you make that point later...

I'm skipping the 1.1 Terminology section; I have a long-standing distaste for the definitions-up-front style ala ISO specs. I much prefer definitions in context; collect them in a glossary at the end if you like, but don't make me slog thru them before they're motivated.

"Component" is defined in this section but not used elsewhere. What's up with that?

Hmm... here's a candidate for the thesis:

  The primary motivation to allow instances of a language to be extended is to decentralize the task of designing, maintaining, and implementing extensions. It allows senders to change the instances without going through a centralized authority.

But I'm not sure that's the main point to be made about language evolution.
I think the main point is that agents in the web come and go, with varying capabilities. New language features appear to express new capabilities and concepts, but old agents don't go away... at least not right away.

Are "instances of a language" really extended? I suppose by "instances of a language" you mean documents. Documents don't change; they're like numbers. The number 4 never changes. Well... there's another sense of the word "document" ala file (more generally: resource), which does have state, but I don't think that's what you're talking about here. I think you're using "instance" as a synonym (or specialization) of 'representation'.

A thought: an extensible language is one with some syntax reserved for future use. To extend a language is to say what some of the reserved parts of the syntax mean.

Hmm... 1.4 Why Do Languages Change? seems to miss the main point too. The main reason languages change is that the agents that use them change, and the things that languages are used to talk about (i.e. life, the world, business, poetry, media, data, research, etc.) change.

| Using QNames to identify words in the WordNet database, for example, or the names of functions and operators in XPath2 are examples of "just name" languages.

How so? WordNet is a rich structure of generalizations and specializations, not just a list of names. The functions and operators in XPath2 are surely more than just a list of names; the names are connected to a data model, to datatype semantics, etc.

|This is by no means an exhaustive list. Nor are these categories completely clear cut.

Then what's the point of this section 1.6 Kinds of Languages?

|Applications are expected to behave properly

Hmm... elsewhere in the webarch doc and this finding we speak of "agents". Now we have "Applications". Is this separate term really called for?

|4.1 An Example
|Throughout this paper, we'll motivate our discussion of versioning with an ongoing example.

Yes, please!
Please give readers the example (or at least a start) before asking them to slog thru definitions. (As I said in earlier feedback, I'd like this finding to take a more historical approach: tell stories of what we know about the history of evolution of data formats, from RFC822, to HTML, to XML, to SOAP and so on.)

| The processor must understand ...

First agent, then application, now processor. I don't see motivation for the distinct terms.

| Any Namespace: The language SHOULD provide for extension in any namespace.

I'm not sure I agree with that. I certainly don't see enough justification that I could convince some WG of that point based on this text. This point is justified by only one example, and a hypothetical one at that.

Also, SHOULD/MUST/MAY are for agents in protocols. Languages don't do things; they just are. We don't say "numbers should be greater than 4". Either they are or they aren't.

"The language" seems odd too... which language? Do you mean "All languages"?

| Full Extensibility: All XML Elements SHOULD allow any attributes and allow any elements in their content models.

Again, I'm not sure I agree and I don't see enough justification here to convince typical WG members.

| The key value of the extension strategy described above is that existing XML documents can be extended without having to change existing implementations.

Yes! There's your thesis. Well, generalize it a bit so it's not specific to "the strategy described above".

And, editorially, we're now up to 4 terms for the same concept: Agent, Application, processor, and implementation.

| Must Ignore: Receivers MUST ignore

There's #5!

I'm confused by that use of MUST. Taken out of context, it seems to refer to *all* receivers of any kind. But it's in a "good practice" box, not a "constraint" box, and it's prefaced by "For many applications...".

--
Dan Connolly, W3C http://www.w3.org/People/Connolly/
Received on Thursday, 23 October 2003 12:16:59 UTC