- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Fri, 15 Jun 2012 00:56:52 -0400
- To: "Costello, Roger L." <costello@mitre.org>
- Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>, "xmlschema-dev@w3.org" <xmlschema-dev@w3.org>
On Jun 14, 2012, at 2:59 PM, Costello, Roger L. wrote:

> Hi Folks,
>
> In section 2.2.3.2 of the Structures specification it says:
>
> (1) A particle is a term ... consisting of either an element declaration, a wildcard or a model group ...
>
> (2) Term is ... any of the three kinds of components that can appear in particles.

A slightly different choice of elision may make the distinction being drawn clearer:

  "A particle is a term in the grammar ..., together with occurrence constraints."

  "Term is used to refer to any of the three kinds of components which can appear in particles."

> Huh?
>
> A particle is a term which is a particle which is a term which is ...

Not quite. A particle is a term plus occurrence constraints. A term is the part of the particle that is not the occurrence constraints: an element declaration, a wildcard, or a model group.

Consider a slightly different case. In regular expressions, the expressions a+ and a* differ in one way (+ vs *), but are the same in one way (a vs a). The expressions a+ and b+ also differ in one way and are the same in one way. If you are going to talk about the structure of regular expressions, and make rules about their structure and meaning, you may find you need terms to distinguish the expression a+ from its two distinct component parts.

Similarly for bits of content models. If a content model says the content of an element can contain one or more 'a' elements, it will sometimes be helpful to be able to distinguish cleanly between the reference to the 'a' element and the reference to 'one or more of those'. In XSD, the larger expression (the analog of a+) is called a 'particle', and the basic part of the particle, to which the occurrence constraints expressed by minOccurs and maxOccurs apply, is called the 'term' of the particle.

The choice of names for these concepts is inevitably a little arbitrary.

> ...
>
> This is even more interesting:
>
> (3) A basic term is an Element Declaration or a Wildcard.
So, any term that is not a model group.

> (4) A basic particle is a Particle whose term is a basic term.
>
> Huh?
>
> Let's examine (4) shall we? A basic particle is a Particle (which according to (1) is a term) whose term (which according to (2) is a particle) is a basic term.
>
> This is just gibberish as far as I can tell.

If you have followed what I said above, it should be clear that it's not gibberish but a straightforward statement. Particles consist of terms plus occurrence constraints, and one classification of particles (basic vs. non-basic) depends on the nature of the particle's term (also basic vs. non-basic).

> Why can't this stuff be written in a simple, concise manner?

In the case of this text? Sheer human fallibility on the part of the editors. Sorry about that. Send any WG chair an infallible superhuman editor and they'll thank you for it. But until you can secure a reliable supply of them, WG chairs are stuck appointing editors who will screw things up.

In this case, two editors (at least) screwed up here, one by drafting sentences which leave some smart readers high and dry, and the other by failing to see the problem and revise the sentences to make them give better guidance to the reader.

Since I was one of those editors, I'll say yes, you're right, this should be clearer. If a careful reader is confused, then the text should work harder to throw that reader a lifeline. But I'll also confess (human fallibility again) that until you provided concrete evidence that these sentences have confused a careful reader, I thought they were perfectly clear. And it's still not clear to me how to redraft them so the new sentences do a better job than the existing text.

> Why introduce terminology that has no apparent benefit?

Naming things is an important step in making them easier to talk about, think about, and understand; introducing terminology for important concepts in a spec is one of the most important functions the text of any spec performs.
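As an illustration of what the two names make it possible to say, the particle/term distinction discussed above can be sketched as a small data model. This is only a sketch: the Python class and function names are hypothetical and come from no XSD library, the components are reduced to the few properties needed here, and `None` is an assumed stand-in for an unbounded maxOccurs.

```python
from dataclasses import dataclass
from typing import Optional, Tuple, Union


@dataclass(frozen=True)
class ElementDeclaration:
    name: str


@dataclass(frozen=True)
class Wildcard:
    pass  # namespace constraints etc. omitted in this sketch


@dataclass(frozen=True)
class ModelGroup:
    compositor: str                     # "sequence", "choice", or "all"
    particles: Tuple["Particle", ...]   # a model group contains particles


# A term is any of the three kinds of components that can appear in a particle.
Term = Union[ElementDeclaration, Wildcard, ModelGroup]


@dataclass(frozen=True)
class Particle:
    """A term together with occurrence constraints."""
    term: Term
    min_occurs: int = 1
    max_occurs: Optional[int] = 1       # None = unbounded (assumption of this sketch)


def is_basic_term(term: Term) -> bool:
    # A basic term is an element declaration or a wildcard,
    # i.e. any term that is not a model group.
    return isinstance(term, (ElementDeclaration, Wildcard))


def is_basic_particle(particle: Particle) -> bool:
    # A basic particle is a particle whose term is a basic term.
    return is_basic_term(particle.term)


a = ElementDeclaration("a")
a_plus = Particle(a, min_occurs=1, max_occurs=None)   # analog of a+
a_star = Particle(a, min_occurs=0, max_occurs=None)   # analog of a*
group = Particle(ModelGroup("sequence", (a_plus,)))   # particle whose term is a model group
```

In this sketch `a_plus` and `a_star` share the same term and differ only in their occurrence constraints, which mirrors the a+ versus a* comparison; `is_basic_particle` is definition (4) restated as a one-line function over the particle's term.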
Good terminological choices help the WG responsible for maintaining a spec, as well as readers of the spec, to clarify their thoughts.

I think that sometimes it may be hard to perceive the advantage of even good terminological choices at first glance; in those cases, the answer to your question is: because it may have benefits that are not apparent. In other cases, the apparent lack of benefit is due to a failure of understanding on the part of the reader.

There are a lot of rules in XSD that apply to terms and there are different rules that apply to the particles which enclose those terms. Trying to express those rules without the terms "particle" and "term" would not make the spec any clearer; there are plenty of examples in the XSD spec where the introduction of suitable terminology would simplify the text dramatically.

> Why are terms used before they are defined? For example, the term "particle" is used in 2.2.1.3 but isn't defined until 2.2.3.2 and model group is used in 2.2.1.3 but isn't defined until 2.2.3.1.

In the general case, I think the reason is that it's not always possible to sequence definitions of complex sets of terms so that no definition or discussion appeals to any terms defined later in the sequence.

Here, both of the terms you mention are first introduced at the beginning of section 2.2 as components of XSD schemas, so the reader of section 2.2.1.3 can plausibly be expected to know that "particle" denotes a kind of schema component, even if the reader may not yet know much about what kind of component a particle is. And that level of understanding really ought to suffice for understanding what 2.2.1.3 says about particles.

> Why is this specification 380 pages long?

That one's easy. Because the 1.0 WG did not have the time it would have taken to make it shorter.
And the 1.1 WG was unable, for reasons of backward compatibility, to perform the kind of conceptual simplification that would be necessary to make the text shorter; instead, the editorial revisions in 1.1 mostly took the form of simplified syntax, some modest refactoring of the prose, and the addition of explanatory material which made the text even longer.

Shorter specs are nicer, when they are possible and when they do the job. Sometimes, however, the time comes when it seems necessary to ship an imperfect spec rather than delay it further.

--
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************
Received on Friday, 15 June 2012 04:57:19 UTC