- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Thu, 4 Jun 2009 17:33:17 -0600
- To: "Costello, Roger L." <costello@mitre.org>
- Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>, "'xmlschema-dev@w3.org'" <xmlschema-dev@w3.org>
On 4 Jun 2009, at 13:49, Costello, Roger L. wrote:

> Hi Folks,
> Consider this schema, which uses <defaultOpenContent> to make the
> entire schema open:

[... example snipped ...]

> Can I add extension elements before and after the root element
> (BookStore)?

Various answers to this are possible (including 'yes' and 'no' and 'maybe'); which one applies depends on aspects of the validation episode you haven't specified.

> Is this instance document legal (I have wrapped the root element
> with an extension element):

[... example snipped ...]

I'm sorry if this sounds pedantic, but "legality" isn't a property or term defined by the XSD spec. To the extent that your question can be paraphrased as "does this document abide by the agreement between sender and receiver?", the answer is: it depends. You haven't told us what that agreement is. We may assume that it involves the use of the schema document you describe, and that it wants the [validity] property of the validation root not to be 'invalid', but (a) neither of those is necessarily the case and (b) by themselves they don't suffice to determine an answer.

I think you mean "Does the presence of an xsd:defaultOpenContent element in the schema document have any effect on the validity conditions of the parent, if any, of r:BookStore?"

The answer is no. The defaultOpenContent element in the schema document determines how the types defined in this schema document behave. It does not affect types defined in other schema documents, and it does not affect xsd:anyType.

Readers primarily interested in the effect of defaultOpenContent can stop reading now. The rest of this note is an explanation of why the question "is this instance document legal" can almost never be answered without more information than provided in this case, and what factors may determine the answer.

....

Any conforming schema document defines some schema components, which can be used in validation. But a collection of components is not by itself enough to determine the result of validation, any more than a DTD file is by itself enough to determine the result of validation using DTDs.

In order to validate the instance document you give, the person or agent invoking the validator needs to specify:

- What schema is to be used? Is it the schema corresponding to the schema document you specified, with no additional components? Or might the schema document you specified be combined with another one which provides a definition for r:MyFavoriteBookStore?

  If the schema used for validation contains an element declaration for r:MyFavoriteBookStore, then whether that element is valid against that declaration or not depends on what that declaration says. The open content specified in your schema document affects the type definitions given in your schema document, not others.

  Given a particular schema, whether the document is legal (satisfies the agreement between sender and recipient) or not depends on what validation assessment you request the validator to perform, and on details of the agreement.

- Where does validation start? At the document root? At element //Date[2]? Somewhere else? There are lots of possibilities.

- Which validation mode is used to start? Possible answers include

  . element-driven: the invoker specifies an element declaration in the schema and the validation root is validated against that declaration.

  . type-driven: the invoker specifies a type definition in the schema and the validation root is validated against that definition.

  . lax wildcard validation: the invoker doesn't specify a declaration or definition. Instead, the validator looks for a top-level element declaration (and possibly, if the validation root has an xsi:type attribute, for a top-level type definition), uses what it finds, and doesn't complain if it doesn't find one.

  . strict wildcard validation: like lax validation, but if no element declaration or type definition is found for the validation root, there's a problem.

Not all validators provide options for all of these possibilities; in the extreme case, the invoker specifies a particular schema and a particular mode for starting validation by using a particular processor that only supports one way of constructing the schema (or only one schema) and only one invocation mode. If you want the choice to lie with you and not with your software, examine the functionality offered by your validator.

The XSD spec allows you a great deal of flexibility in defining what classes of document you want to accept or reject, that is, in saying what you want to be legal. The flip side of that is that you are responsible for saying what you mean.

If you want the instance to be legal, you can certainly specify a schema-based agreement between sender and recipient that makes it legal. There are several ways to achieve that result: which one you choose depends on *why* you want it to be legal. Similarly, if you want the instance to be illegal, you can achieve that, too, and again your choice of methods depends on why you want it to be illegal.

So, to illustrate what I said about the possible answers to your question:

(1) Suppose the agreement between sender and receiver is that (a) validation starts at the document's root element in (b) element-driven mode, with the element declaration /schemaElement::r:BookStore, and (c) in the PSVI, the validation root should have [validity]=valid.

Result: not legal. The r:MyFavoriteBookStore element doesn't match the prescribed element declaration, so it isn't valid.

(2) Suppose the agreement between sender and receiver is that (a) validation starts at the first r:BookStore element in the document, in (b) element-driven mode, with the element declaration /schemaElement::r:BookStore, and (c) in the PSVI, the result should have [validity]=valid and [validation attempted] = partial or full.

Result: legal (I think; I just eyeballed the instance and schema and didn't see anything wrong beyond the use of xsd:string for what is apparently intended to be natural-language data, which is a poor design choice but doesn't make the document invalid).

(3) Suppose the agreement between sender and receiver is that (a) validation starts at the document root, in (b) type-driven mode, with the type definition /schemaElement::r:BookStore/type::*, and (c) in the PSVI, the result should have [validity]=valid or [validity]=notKnown and [validation attempted] = partial or full.

Result: not legal. The type requires at least one r:Book element, but the only child is named r:BookStore.

(4) Suppose the agreement between sender and receiver is that (a) validation starts at the document root, in (b) type-driven mode, with the type definition /type::xsd:anyType, and (c) in the PSVI, the result should have [validity]=valid or [validity]=notKnown and [validation attempted] = partial or full.

Result: legal.

(5) Suppose we specify (a) lax wildcard mode, and (b) the resulting PSVI has [validity]=valid on the validation root.

Result: not legal. The schema you describe has no element declaration for r:MyFavoriteBookStore, so the PSVI has [validity]=notKnown.

(6) Suppose we specify (a) strict wildcard mode, and (b) the resulting PSVI has [validity]=valid on the validation root.

Result: not legal. The schema you describe has no element declaration for r:MyFavoriteBookStore, so the PSVI has [validity]=notKnown.

(7) Suppose we specify (a) strict wildcard mode, and (b) the resulting PSVI has [validity]=valid or [validity]=notKnown (i.e. does NOT have [validity]=invalid) on the validation root.

Result: legal. The schema you describe has no element declaration for r:MyFavoriteBookStore, so it's laxly assessed and the PSVI has [validity]=notKnown.

Note that the document would be legal even if you replaced the third r:Book element with

  <Book>This is not a legal book: it has character data where it shouldn't, and it lacks the required Title, Author, Date, ISBN, and Publisher children.</Book>

When the validation root is laxly assessed, it is never invalid: invalidity bubbles up from child to parent only for elements with declarations.

(8) Suppose we specify (a) strict wildcard mode, and (b) the resulting PSVI has [validity]=valid or [validity]=notKnown (i.e. does NOT have [validity]=invalid) on every element and attribute (i.e. there are no invalid elements or attributes).

Result: legal. The schema you describe has no element declaration for r:MyFavoriteBookStore, so it's laxly assessed and the PSVI has [validity]=notKnown. The r:BookStore child is valid, as are all of its children. And if they weren't because you replaced the third r:Book element with the invalid element given above, the document would not be legal, because the r:BookStore element would be invalid.

I hope this helps.

--
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************
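
The difference between the agreements in (5) through (8) can be made concrete with a small Python sketch. It is only an illustration and does not call any real validator: the Validity enum and the functions root_only_agreement and no_invalid_items_agreement are hypothetical names introduced here, and the [validity] values fed in are the outcomes described in the message, not output from a processor.

```python
from enum import Enum


class Validity(Enum):
    """The three values the PSVI [validity] property can take."""
    VALID = "valid"
    INVALID = "invalid"
    NOT_KNOWN = "notKnown"


def root_only_agreement(root_validity, accepted):
    """Agreements like (5)-(7): only the validation root is inspected.

    `accepted` is the set of [validity] values that sender and
    receiver have agreed to treat as "legal" (hypothetical parameter).
    """
    return root_validity in accepted


def no_invalid_items_agreement(all_validities):
    """Agreement like (8): no element or attribute may be invalid."""
    return all(v is not Validity.INVALID for v in all_validities)


# Outcome described in the message: the root has no declaration,
# so it is laxly assessed and ends up with [validity]=notKnown.
root = Validity.NOT_KNOWN

# Same PSVI, two agreements, two different answers.
print(root_only_agreement(root, {Validity.VALID}))                      # False
print(root_only_agreement(root, {Validity.VALID, Validity.NOT_KNOWN}))  # True

# An invalid r:Book deeper in the tree matters only under the
# "nothing anywhere is invalid" agreement.
with_bad_book = [Validity.NOT_KNOWN, Validity.INVALID, Validity.VALID]
print(no_invalid_items_agreement(with_bad_book))                        # False
```

The same [validity]=notKnown on the validation root is rejected by one agreement and accepted by another, which is why the schema and instance alone do not settle whether the document is "legal".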
Received on Thursday, 4 June 2009 23:34:05 UTC