- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Mon, 12 Oct 2009 18:02:10 -0600
- To: "Costello, Roger L." <costello@mitre.org>
- Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>, "xmlschema-dev@w3.org" <xmlschema-dev@w3.org>
On 12 Oct 2009, at 11:00, Costello, Roger L. wrote:

>
> Hi Folks,
>
> Below are two ways to declare a <Book> element.
>
> Both versions use <all>, to permit the elements within <Book> to
> occur in any order.
>
> The first version uses an unbounded <any>. The second version uses
> interleaved open content.
>
> Are these two versions identical?

You may mean "Do they accept the same inputs as valid instances of
element Book?" Yes, I think they do.

> If so, is there an advantage of one over the other?

Some people may find one formulation clearer or simpler than the
other; they will rightly prefer the one they find clearer. I expect
different people will have different preferences, depending on their
tastes.

If a schema is designed so that every complex type, or most of them,
has a particular form of open content, then the open content can be
defaulted at the schema document level, which will make most content
models shorter and simpler. Readers who forget that the schema
document supplies default open content may be surprised and complain
about action at a distance.

In the case of all-groups, using an explicit wildcard and using
interleave open content are roughly similar in complexity of the
declaration. In other cases, explicit wildcards are much more verbose
and for many schema authors rather error-prone. See
http://www.w3.org/TR/xmlschema-guide2versioning/ for examples.

> If they are not identical, how do they differ?

You may mean "Do they produce indistinguishable PSVIs?" No, not
quite; the [match information] property in the PSVI allows the two to
be distinguished, for elements other than Author, Title, Date, ISBN,
and Publisher: in the one case, those elements will have
[match information] = 'lax' (since they match a lax wildcard in the
content model), and in the other they will have
[match information] = 'open' (since they match open content). This
allows the downstream application to distinguish the two cases, if it
wishes to.
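
For reference, the two formulations being compared might look roughly
like the following XSD 1.1 sketch. The original declarations are not
quoted in this message, so this is a reconstruction under assumptions:
the type names BookWithWildcard and BookWithOpenContent are
hypothetical, the xs:string types are placeholders, and named complex
types are used only so that both versions can sit in one schema
document.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<!-- Illustrative sketch only (XSD 1.1). Type names and the xs:string
     simple types are assumptions, not taken from the original post. -->
<xs:schema xmlns:xs="http://www.w3.org/2001/XMLSchema">

  <!-- Version 1: an explicit, unbounded, lax wildcard inside the
       all-group. Extra children match a wildcard in the content model. -->
  <xs:complexType name="BookWithWildcard">
    <xs:all>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Author" type="xs:string"/>
      <xs:element name="Date" type="xs:string"/>
      <xs:element name="ISBN" type="xs:string"/>
      <xs:element name="Publisher" type="xs:string"/>
      <xs:any minOccurs="0" maxOccurs="unbounded" processContents="lax"/>
    </xs:all>
  </xs:complexType>

  <!-- Version 2: the same all-group with interleaved open content.
       Extra children match the open-content wildcard instead. -->
  <xs:complexType name="BookWithOpenContent">
    <xs:openContent mode="interleave">
      <xs:any processContents="lax"/>
    </xs:openContent>
    <xs:all>
      <xs:element name="Title" type="xs:string"/>
      <xs:element name="Author" type="xs:string"/>
      <xs:element name="Date" type="xs:string"/>
      <xs:element name="ISBN" type="xs:string"/>
      <xs:element name="Publisher" type="xs:string"/>
    </xs:all>
  </xs:complexType>

</xs:schema>
```

Under either type, an instance carrying an additional child (say, a
hypothetical <Edition> element) among the five declared ones should be
accepted; the difference discussed above lies only in how that extra
child is attributed in the PSVI.
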
In the usual case, making downstream processing depend on such a
subtle distinction is probably not a good idea, but YMMV and there may
be special circumstances.

Certainly there are some designers who like the idea of being able to
say that if an element in the input matches an element in the version
N content model, then a version N processor is obligated to process it
in a certain way, and if the element in the input matches a wildcard
(or: matches only open content) in the version N content model, then a
version N processor is obligated to tolerate it, or to ignore it, or
to handle it in some other way. In such a design, "matches an element"
and "matches a wildcard" are syntax-level signals for different kinds
of processing. The signal "matches an open-content wildcard" can fit
nicely into this pattern, either to signal a third kind of processing
or to shift the distinction from element-vs-wildcard to
content-model-vs-opencontent.

HTH

-- 
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************
Received on Tuesday, 13 October 2009 00:02:41 UTC