- From: David Orchard <dorchard@bea.com>
- Date: Tue, 12 Jun 2007 14:38:22 -0700
- To: "Norman Walsh" <ndw@nwalsh.com>
- Cc: <www-tag@w3.org>
Follow-on comments inline

> -----Original Message-----
> From: Norman Walsh [mailto:ndw@nwalsh.com]
> Sent: Monday, May 21, 2007 7:11 AM
> To: David Orchard
> Cc: www-tag@w3.org
> Subject: Re: (Partial) review of Versioning XML
>
> / "David Orchard" <dorchard@bea.com> was heard to say:
> |> Really? How does "used by a particular application" fit in? I would
> |> have thought that we meant the set of instances that conform to the
> |> rules of the language independent of any particular application.
> |> Surely my XML language is a language even before there are any
> |> applications that are expecting to process it.
> |
> | Right.. Here's what I think is the right solution.. "Definition: An
> | XML Language is a Language where the text MUST be well-formed XML"
>
> I think that's just a definition of XML. I'd expect a
> language to have some extra-syntactic constraints too: a
> grammar of some sort.

How about "Definition: An XML Language is a Language where the text MUST be well-formed XML and the texts are usually constrained by a schema language. The schema language may be machine processable such as DTDs, XML Schema, Relax NG, or the schema language may be human readable text."

>
> |> > as purchase orders. The purchase order texts may contain
> |> > name elements. Thus instances of a language are always part
> |> > of a text and also may be the
> |>
> |> This paragraph begins with a definition of the term "instance" as a
> |> specific, discrete Text, but this sentence says that instances are
> |> always part of a text. I don't find those two uses of the word
> |> "instance" compatible. What did you mean?
> |
> | I'm trying to come up with something where an instance is specific
> | Text, but can also use the word instance to talk about a fragment of
> | text. In the example of a PO that contains a Name, the PO in its
> | entirety is an instance, and so is the Name "part". What do you think?
>
> I think that's going to be confusing.
> If you want to make the
> distinction between an instance as a specific Text and an
> instance as a fragment, I think you're going to have to be
> very careful to always say either "instance document" or
> "element instance" or something like that; a qualified
> "instance" in every case.

Yuck. But in most cases, I don't think we need to make the distinction. For our purposes, it's just the text extracted from a document, either the whole document or the element. Do you think that the common understanding of an XML Instance is both or one of those?

>
> |> I suggest you drop the references to information models. As far as I
> |> can tell, you don't refer to it anywhere else in the document.
> |
> | I kept it because I wanted to differentiate the information Set that
> | our part 1 talks about, and the XML specific information Set. Perhaps
> | a bit more elaboration? Or still do you think it should be dropped.
>
> I didn't find any place where I felt reference to an
> information model would have helped, but perhaps that's
> because I'm so familiar with XML information models. If you
> think it's an important point then I think more elaboration,
> or some subsequent reference to it, is necessary.

I think it will become necessary when we flesh out the notions of compatibility, which are related to information extracted.

>
> |> > * Fidelity of XML Schema for the versions of the language.
> |>
> |> I don't understand that sentence.
> |
> | Fixed by saying "Fidelity (or richness or degree of description) of
> | XML Schema for the versions of the language. By fidelity, we mean the
> | degree to which the language is described. "
>
> I don't think that helps. "Fidelity" is about accuracy or
> faithfulness to some standard, it isn't about richness or
> elaborateness.

I disagree a bit because I think fidelity has degrees such as "high" or "low" that resonate with people. But I can live with accuracy or completeness, and precision might even be a good word too.
How about "Accuracy of XML Schema for the versions of the language."

<snip/>

>
> |> [...]
> |> > The use of types and the ability to re-use these types
> |> > across elements is an important factor in component version
> |> > identification.
> |>
> |> How so?
> |
> | How about: The decision to use types and re-use types across
> | components is an important factor in component version identification
> | because the component definition and the component's type may be
> | versioned separately.
>
> I'll have to see that in context again, but I think it's better.

Let me know.

>
> |> > 3.3 Version Numbers
> |> [...]
> |> > 4 Component version identification strategies
> |> [...]
> |> > 1. all components in new namespace(s) for each version
> |> >
> |> > ie version 1 consists of namespaces a + b, version 1.1
> |> > consists of namespaces c + d; or version 1 consists of
> |> > namespace a, version 1.1 consists of namespace b.
> |>
> |> I find it ironic that version numbers are treated somewhat
> |> dismissively as a versioning strategy but the rest of the document
> |> turns around and uses them almost exclusively for distinguishing
> |> between versions.
> |>
> |> This suggests to me that perhaps version numbers are a workable
> |> strategy.
> |
> | I know, I know, I know. But how in normal text can I easily identify
> | versions? Should I say "The first version consists of namespaces a +
> | b, the 2nd version consists of namespaces c + d"
> | ?
> |
> | But changing from "1" to "First" seems like sophistry to me.
>
> I'd play it the other way around, the fact that version
> numbers are clearly useful and natural suggests that perhaps
> they shouldn't be treated so dismissively as a strategy.

I think they are commonly used, but also commonly misused.
I'm starting to realize that using version numbers makes more interesting things possible with respect to forwards compatibility than namespaces do, BUT it's rare to see anybody use version numbers really well. I think the point is becoming well taken that version numbers with XML aren't just a crazy strategy, even with namespaces available. If we can provide some clear guidance on how to use them, that would be wonderful.

>
> |> More importantly, is there really anything important to be said about
> |> the difference between versioning changes made by the original
> |> authors and changes made by a third party.
> |
> | That is a huge point with namespaces, and currently we make no use of
> | the same domain for namespace names in any versioning work.
>
> If you're making a proposal that
>
> http://nwalsh.com/ns/name-extension
>
> is more different from
>
> http://www.example.com/ns/name
>
> than
>
> http://www.example.com/ns/name-extension
>
> And that applications might treat the extensions differently
> because the domain name is or is not the same, I think you
> need to expand on this quite a bit. That's a fairly
> substantial and radical proposal.

I think it's a very interesting possibility to use the first part of a namespace name and do pattern matching to determine who's doing the extension. I think it could be very appropriate for W3C specifications. One example that I have in mind is WS-Policy. Currently, the extensibility model roughly says that any unknown extension is treated as a Policy Assertion. This means it is subject to the Policy normalization rules. A more sophisticated way could say that any unknown extension not defined with a base of http://www.w3.org/ns/ws-policy is treated as a Policy Assertion, and any unknown extension defined with a base of http://www.w3.org/ns/ws-policy is a Policy Language extension and is not treated as a Policy Assertion.
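
A minimal sketch of that namespace-base rule, assuming a plain prefix test on the namespace name. Only the base URI comes from the text above; the function name, the returned labels, and the choice of "/" or "#" as the boundary after the base are hypothetical, and nothing here is defined by WS-Policy itself.

```python
# Hypothetical sketch: decide how to treat an unknown extension from the
# base of its namespace name. Only POLICY_BASE is taken from the message;
# the function name and the two labels are illustrative assumptions.
POLICY_BASE = "http://www.w3.org/ns/ws-policy"


def classify_unknown_extension(namespace_uri: str, base: str = POLICY_BASE) -> str:
    """Return a label saying how an unknown extension would be treated.

    A namespace equal to the base, or continuing it after "/" or "#",
    counts as defined "with a base of" the language namespace. Anything
    else is treated as an ordinary Policy Assertion.
    """
    under_base = namespace_uri == base or namespace_uri.startswith(
        (base + "/", base + "#")
    )
    return "policy-language-extension" if under_base else "policy-assertion"


if __name__ == "__main__":
    for ns in (
        "http://www.w3.org/ns/ws-policy/ext",
        "http://www.example.com/ns/name-extension",
    ):
        print(ns, "->", classify_unknown_extension(ns))
```

Requiring a "/" or "#" after the base keeps a lookalike such as http://www.w3.org/ns/ws-policy-other from matching. Comparing only the domain part of the namespace name would be an equally plausible reading of "the first part of a namespace name".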
But I know of no places that do that, so I don't want to go into too much detail in the TAG finding.

<snip/>

> |> > The last two examples show that the middle is now a
> |> > mandatory part of the name. This is indicated by just the
> |> > version number or a new namespace plus version number.
> |>
> |> How does the change from version "1.0" to version "2.0"
> |> indicate that the middle is now mandatory? I don't get that at all.
> |
> | Right, good point. How about "The last two examples use a major
> | version number change to show that the middle is now a mandatory
> | part of the name. This is indicated by just the version number or
> | a new namespace plus version number."
>
> Well, the problem is that the version number doesn't show
> anything at all. I think this needs to be turned around. Perhaps:
>
> In the last example, the version number has been changed from
> 1.0 to 2.0. Incrementing the major part of a version number
> often indicates a degree of backwards incompatible change. In
> this case, perhaps it indicates that the middle name is now
> mandatory where it had previously been optional.

Done as "In the last two examples, the version number has been changed from 1.0 to 2.0. Incrementing the major part of a version number often indicates an incompatible change. In this case, perhaps it indicates that the middle name is now mandatory where it had previously been optional."

<snip/>

> |
> |> > If the language designer has also allowed for forwards
> |> > compatibility, then the forwards compatibility rule
> |> > must be over-ridden
> |> >
> |> > Good Practice
> |> >
> |> > Provide Forwards Compatibility Override Rule: Languages
> |> > with forwards compatibility support SHOULD provide an
> |> > override for indicating incompatible extensions.
> |>
> |> I'm not sure I believe this good practice. As I recall, Roy argued
> |> pretty strongly and persuasively against it.
> |>
> |
> | How about I change the SHOULD to MAY?
>
> Well, that's certainly ok, but it weakens the good practice
> to the point where it becomes dubious to call it out
> specifically as a good practice.
>
> I think we should probably attempt to wrestle this one to the
> ground and see if we really have community support that it is
> a good practice.

How about making it a sentence, removing the GPN, and adding a condition. Something like: "Languages with forwards compatibility support MAY provide an override for indicating incompatible extensions but should only do so IF the incompatible extensions can be clearly targeted or scoped".

>
> |> [...]
> |> > Example 7: Using SOAP Must Understand
> |> >
> |> > <soap:envelope>
> |> >   <soap:body>
> |> >     <personName xmlns="http://www.example.org/name/1">
> |> >       <given>Dave</given>
> |> >       <family>Orchard</family>
> |> >     </personName>
> |> >   </soap:body>
> |> > </soap:envelope>
> |> >
> |> > <soap:envelope>
> |> >   <soap:header>
> |> >     <midns:middle xmlns:midns="http://www.example.org/name/mid/1"
> |> >                   soap:mustUnderstand="true">
> |> >       Bryce
> |> >     </midns:middle>
> |> >   </soap:header>
> |> >   <soap:body>
> |> >     <personName xmlns="http://www.example.org/name/1">
> |> >       <given>Dave</given>
> |> >       <family>Orchard</family>
> |> >     </personName>
> |> >   </soap:body>
> |> > </soap:envelope>
> |>
> |> I imagine that midns:middle header is designed to make sure that the
> |> middle name will be understood. Is it then intentional and/or
> |> significant that the body doesn't contain a middle?
> |>
> |
> | I added "Use of a SOAP header for an extension may be because the body
> | was not designed to be extensible, or because the extension is
> | considered semantically separate from the body and will typically be
> | processed differently than the body."
>
> That still leaves my question: is it intentional and/or
> significant that the body doesn't contain a midns:middle
> after you've gone to the trouble of making sure the consumer
> will understand it?
> If the example would not be correct
> and/or more clear if the personName in the soap:body
> contained a midns:middle, then I'm missing something
> significant about the example.

It is intentional. There are two reasons why: 1) the personName might not have been extensible, regardless of MustUnderstand; 2) the personName may have been extensible, but personName didn't support applying a mustUnderstand flag. The case where personName is extensible and has a mustUnderstand flag is shown in the other mustUnderstand example.

>
> |> The implicit focus of the document is clearly XML versioning
> |> strategies in a W3C XML Schema-based, web-services style environment.
> |> I appreciate that that is a large and significant environment. But
> |> it's not the only environment and I don't think that the document is
> |> as explicit as it could be about its scope.
> |
> | What limits the document to web-services style environment? I think
> | this document completely applies to any XML-Schema based environment,
> | like a Yahoo Search API that uses Schema. Or when you say
> | "web-services style environment", do you mean roughly what we called
> | "open systems" in part 1? It is definitely about systems that are
> | under more than one administrative domain and attempts to help authors
> | avoid that one in Deutsch's 8 fallacies.
>
> What I mean is that there seems to be a bias towards systems that are
> (1) using XML Schema for describing constraints (2)
> constructing "typed object graphs" as a mechanism for
> representing XML documents and (3) aborting processing unless
> full validity is obtained.
>
> There are clearly other strategies. At the other end of the
> spectrum is the HTML model where the browser accepts just
> about anything and does its best.
> In the middle are systems
> like the one I use every day for formatting DocBook documents
> where failure to validate may produce distinctive error
> output but it doesn't prevent the user from pressing on if
> they really insist.

Hmm. The first part of the document is non-XML Schema, says little beyond a description of types, and says nothing about Compatible extensions. Now do you think that up to section 6, it's still missing the mark? Obviously, starting in section 6 it's XML Schema all the way, but maybe we can figure out how to make sections 1-5 more reflective of what you'd like.

Cheers,
Dave
Received on Tuesday, 12 June 2007 21:38:45 UTC