RE: on "Versioning XML Languages" from David Orchard on 2003-10-23 (www-tag@w3.org from October 2003)

From: David Orchard <dorchard@bea.com>
Date: Thu, 23 Oct 2003 11:34:07 -0700
To: "'Dare Obasanjo'" <dareo@microsoft.com>, "'Dan Connolly'" <connolly@w3.org>, <www-tag@w3.org>
Message-ID: <006001c39994$3fb85380$fe2b000a@beasys.com>
Here is how I'd like this to play out:
1. The TAG produces a finding on extensibility/versioning that is schema
neutral
2. The TAG produces a finding or a Note on how to achieve said extensibility
in XML Schema.  XML Schema is, after all, the anointed schema language for
XML by the W3C.  And customers/specs are using it right now.
3. Some group of people (myself/TAG as a body/interested parties) comment on
what XML Schema could do in v1.1 to make extensibility and versioning
easier.  

If we don't do step #2, then it is difficult to offer #3.  Now I happen to
believe that there are some design flaws in XML Schema, such as the
determinism requirement, the complexity of specifying an open content
model, and the intersection of an open content model with type
extensibility/refinement and substitution groups.  But it seems logical to
go through the process above to identify the problems and help improve the
result.  I very much want to get to step #3 as well.

Further, not providing #2 leaves me extremely uncomfortable because it
effectively says "we (the W3C) have provided an XML schema language and a
finding on extensibility/versioning but no relationship between the two".
That just doesn't pass muster for me.  I'm fully prepared for #2 to have a
preface saying something like "This finding/note shows techniques that are
more difficult to use than we would like.  For now, it's the best we can do.
We are working to make this easier".  And heck, maybe the RELAX NG, or other,
folks can produce their own version of #2 that says "and look how much
easier it is to accomplish #1 in our language".

Again, I'm not pleased with the complexity of the schema techniques.  The
finding originally had a rather lengthy discussion about what schema could
do differently but Norm wisely made us take it out.  I plan on having that
material appear in a separate forum, hopefully soon.  
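
For contrast, the instance-level behavior at stake is simple to state. Below is a minimal sketch in Python, not taken from the finding, of a receiver that tolerates the versioning style described in the quoted message below: the sender bumps a version attribute and adds elements, and an older receiver ignores what it does not recognize. The `name` vocabulary, the `KNOWN_ELEMENTS` set, and the `read_name` helper are all hypothetical, invented purely for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical vocabulary: the child elements a version-1 receiver understands.
KNOWN_ELEMENTS = {"first", "last"}


def read_name(xml_text):
    """Collect the known children of the root element and skip the rest.

    Returns a (known, ignored) pair: a dict of recognized element names to
    their text, and a list of element names that were ignored.
    """
    root = ET.fromstring(xml_text)
    known = {}
    ignored = []
    for child in root:
        if child.tag in KNOWN_ELEMENTS:
            known[child.tag] = child.text
        else:
            # "Must ignore": an unrecognized element is skipped, not an error.
            ignored.append(child.tag)
    return known, ignored


# A hypothetical version-2 document that adds a <middle> element.
V2_DOCUMENT = """<name version="2">
  <first>Jane</first>
  <last>Doe</last>
  <middle>Q</middle>
</name>"""

known, ignored = read_name(V2_DOCUMENT)
print(known)
print(ignored)
```

The hard part, as far as I can tell, is not writing such a receiver but getting a schema to describe what it accepts.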

cheers,
Dave

> -----Original Message-----
> From: www-tag-request@w3.org 
> [mailto:www-tag-request@w3.org]On Behalf Of
> Dare Obasanjo
> Sent: Thursday, October 23, 2003 9:15 AM
> To: Dan Connolly; www-tag@w3.org
> Subject: RE: on "Versioning XML Languages"
> 
> 
> 
> I'd also want to add my voice to the request that the XML 
> Schema specific bits be removed from any documents on 
> versioning XML vocabularies produced by the TAG. The main 
> problem I have with doing this is that W3C XML Schema does 
> not directly support the most common way XML vocabularies are 
> versioned in the wild (change the version attribute & stick 
> in more elements & attributes) without seemingly absurd 
> contortions and limitations. 
>  
> I'd rather not see a TAG document on versioning contain 
> rationalizations of what are basically design flaws in W3C 
> XML Schema nor would it make sense to promote them as good 
> architecture for the WWW. 
> 
> ________________________________
> 
> From: www-tag-request@w3.org on behalf of Dan Connolly
> Sent: Fri 10/17/2003 5:44 AM
> To: www-tag@w3.org
> Subject: on "Versioning XML Languages"
> 
> 
> 
> 
> My comments on
> 
> http://www.w3.org/2001/tag/doc/versioning.html
>  of 18 Sep 2003
> 
> Summary: The enumeration of strategies in section 2 Versioning
> Strategies is good and important stuff, but the thesis of
> the finding is either buried or off the mark, the boxed
> points are insufficiently justified, and it needs a
> lot of editorial work (terminology is loose, items
> in references section aren't cited in the body, etc.).
> 
> caveat: I didn't read the whole thing; I sorta lost
> the story line around section 5 or 6. I could stand
> for the XML Schema specific bits to be split out
> into a separate document.
> 
> Comments as I read it:
> 
> |XML is designed for the creation of tag sets, languages of elements and
> attributes.
> 
> The term "tag set" is introduced here but not used elsewhere.
> 
> suggest: XML, Extensible Markup Language, provides common constructs,
> elements and attributes, for use in a large and growing variety of
> languages.
> 
> | XML is self-describing in the minimal sense that any XML parser can
> recognize the namespaced elements and attributes, attribute values, and
> text content of a document.
> 
> Hmm... I don't see how that makes it self-describing. suggest
> striking that.
> 
> XML documents that include DTDs are self-describing in that the DTD
> part describes the other part. XML documents can participate
> in a self-describing Web by way of namespace names and namespace
> documents (I prefer the term "grounded in the web" to
> "self-describing" for that sense anyway). I suppose you could
> say that <partnum>43</partnum> is somewhat self-descriptive,
> but only to an agent that has some prior understanding of
> the term "partnum".
> 
> | It is designed for the combination of languages in instance documents.
> 
> suggest: Its self-similar syntax supports documents composed from
> multiple languages.
> 
> | This paper discusses how developers can design with extensibility and
> change in mind, making backward-compatible and forward-compatible
> changes possible in the future.
> 
> "This paper discusses..." -- now *that's* self-describing. Too meta for
> my tastes. Do we need this sort of preface material in addition to
> the abstract? I think not.
> 
> Hmm... right into 1.1 Terminology without actually establishing the
> thesis of the finding. Surely the thesis is something about evolution
> of XML languages being a necessary part of the continuous evolution
> of the Web, yes? Perhaps you make that point later...
> 
> I'm skipping the 1.1 Terminology section; I have a long-standing
> distaste for the definitions-up-front style ala ISO specs.
> I much prefer definitions in context; collect them in a glossary
> at the end if you like, but don't make me slog thru them
> before they're motivated. "Component" is defined in this
> section but not used elsewhere. What's up with that?
> 
> Hmm... here's a candidate for the thesis:
> 
>   The primary motivation to allow instances of a language to be
>   extended is to decentralize the task of designing, maintaining,
>   and implementing extensions. It allows senders to change the
>   instances without going through a centralized authority.
> 
> But I'm not sure that's the main point to be made about language
> evolution. I think the main point is that agents in the web
> come and go, with varying capabilities. New language features
> appear to express new capabilities and concepts, but old
> agents don't go away... at least not right away.
> 
> Are "instances of a language" really extended? I suppose
> by "instances of a language" you mean documents. Documents
> don't change; they're like numbers. The number 4 never
> changes. Well... there's another sense of the word "document"
> ala file (more generally: resource), which does have state,
> but I don't think that's what you're talking about here.
> I think you're using "instance" as a synonym (or specialization)
> of 'representation'.
> 
> A thought: an extensible language is one with some syntax
> reserved for future use. To extend a language is to
> say what some of the reserved parts of the syntax mean.
> 
> Hmm... 1.4 Why Do Languages Change? seems to miss the main
> point too. The main reason languages change is that the
> agents that use them change, and the things that languages
> are used to talk about (i.e. life, the world, business,
> poetry, media, data, research, etc.) change.
> 
> | Using QNames to identify words in the WordNet database, for example,
> or the names of functions and operators in XPath2 are examples of "just
> name" languages.
> 
> How so? WordNet is a rich structure of generalizations and
> specializations, not just a list of names. The functions and
> operators in XPath2 are surely more than just a list of names;
> the names are connected to a data model, to datatype semantics,
> etc.
> 
> |This is by no means an exhaustive list. Nor are these categories
> completely clear cut.
> 
> Then what's the point of this section 1.6 Kinds of Languages?
> 
> |Applications are expected to behave properly
> 
> Hmm... elsewhere in the webarch doc and this finding we speak
> of "agents". Now we have "Applications". Is this separate term
> really called for?
> 
> |4.1 An Example
> |Throughout this paper, we'll motivate our discussion of versioning with
> an ongoing example.
> 
> Yes, please! Please give readers the example (or at least a start)
> before asking them to slog thru definitions.
> 
> (As I said in earlier feedback, I'd like this finding to take a
> more historical approach: tell stories of what we know about
> the history of evolution of data formats, from RFC822, to
> HTML, to XML, to SOAP and so on)
> 
> | The processor must understand ...
> 
> First agent, then application, now processor. I don't see motivation
> for the distinct terms.
> 
> | Any Namespace: The language SHOULD provide for extension in any
> namespace.
> 
> I'm not sure I agree with that. I certainly don't see enough
> justification that I could convince some WG of that point
> based on this text.
> 
> This point is justified by only one example, and a hypothetical
> one at that.
> 
> Also, SHOULD/MUST/MAY are for agents in protocols. Languages don't
> do things; they just are. We don't say "numbers should be
> greater than 4". Either they are or they aren't.
> 
> "The language" seems odd too... which language? Do you mean
> "All languages"?
> 
> | Full Extensibility: All XML Elements SHOULD allow any attributes and
> allow any elements in their content models.
> 
> Again, I'm not sure I agree and I don't see enough justification
> here to convince typical WG members.
> 
> | The key value of the extension strategy described above is that
> existing XML documents can be extended without having to change existing
> implementations.
> 
> Yes! There's your thesis. Well, generalize it a bit so it's
> not specific to "the strategy described above".
> 
> And, editorially, we're now up to 4 terms for the
> same concept: Agent, Application, processor, and implementation.
> 
> | Must Ignore: Receivers MUST ignore
> 
> There's #5!
> 
> I'm confused by that use of MUST. Taken out of context, it seems
> to refer to *all* receivers of any kind. But it's in a "good practice"
> box, not a "constraint" box, and it's prefaced by "For many
> applications...".
> 
> 
> 
> 
> 
> --
> Dan Connolly, W3C http://www.w3.org/People/Connolly/
> 
> 
> 
> 
>
Received on Thursday, 23 October 2003 14:34:14 UTC