- From: C. M. Sperberg-McQueen <cmsmcq@blackmesatech.com>
- Date: Tue, 14 Apr 2009 13:01:25 -0600
- To: John Arwe <johnarwe@us.ibm.com>, www-xml-schema-comments@w3.org
- Cc: "C. M. Sperberg-McQueen" <cmsmcq@blackmesatech.com>
On 2 September 2008, John Arwe filed bug 6014
(http://www.w3.org/Bugs/Public/show_bug.cgi?id=6014).

I apologize for the length of time it has taken to draft this
response. I'm sending it as email to the originator and to the
XML Schema comments list, to avoid overburdening Bugzilla with it.
I hope this doesn't inconvenience anyone.

First, thank you again for your reading and comments.

> 1.5 Documentation Conventions and Terminology - deprecated
> from: although some processors may choose to issue
> to : although some processors MAY choose to issue

Thanks.

> 3.12.3 Constraints on XML Representations of Type Alternatives
> "No <alternative> element may have more than one of these, and
> each must have at least one of these. "
> No...MAY seems destined to be mis-read. Feels like it wants to be a
> MUST (have at most one), prefer this, or MUST NOT (have >1)

Recast.

> 4.2.1 Conditional inclusion
> "Where they appear, the attributes vc:minVersion and
> vc:maxVersion are treated ... then the element on which the
> attribute appears is to be ignored"
> Does anyone really think describing things in terms of what it's not
> (i.e. negatively) is better than positively?

Not as a general principle, no. But in this case, every positive
formulation I have come up with seems to me clunkier than the
negative formulation in the spec. (That is, reading your comment I
thought "Oh, wow, that really needs to be fixed!" and turned to the
spec to recast the paragraph. But faced with the text, my good
intentions dried up. It may have to do with the fact that the normal
action when reading a document is to act on what it contains;
*excluding* elements is the marked case here, and so it's the one
described.)

> Realizing where you are in the process, since I think this IS
> correct as stated (though it took several passes for me to catch the
> not's and un-'s I missed the first time), I'd settle for a
> non-normative summary stated in the "include if" positive sense.

OK, that I think I can do.
I've proposed that we add

    The effect is that portions of the schema document marked with
    vc:minVersion and/or vc:maxVersion are retained if
    vc:minVersion <= V < vc:maxVersion.

> 4.2.2 Assembling a schema <include> clause 1
> "It is not an error for the ·actual value· of the schemaLocation
> [attribute] to fail to resolve at all, in which case the
> corresponding inclusion must not be performed."
> why maintain this unpleasant dark corner, if redefine and override
> etc all want to mandate resolution?

The original motivation was quite simple. As Andrew Layman, then the
Microsoft rep, argued, you really don't want to say that your
document is valid now (at 11:39 a.m.) because the network is up, but
becomes invalid (without having changed) at 11:46 because the router
goes down and the schema for one of the imported namespaces can't be
dereferenced.

Redefine seems to be a different case because in order to work
correctly it can need a fairly detailed knowledge of the internals
of the schema being overridden; flagging failure to resolve as an
error seems like a better choice there. The other operations present
other mixes of reasons to want resolution failure to be an error and
reasons to want it not to be an error; the 1.0 spec made its best
effort to decide them on a case-by-case basis.

XSD 1.1 seems to me (speaking solely for myself here) very unlikely
to change in this area, for four reasons:

(1) Compatibility with 1.0 is recognized as a desideratum, to be
overridden only in the interests of even more desirable goals. I
don't see making resolution failure on import be an error as such an
overriding goal here. (Complete cleanup of schema composition and
all its dark corners might be such a goal, so I don't rule it out
completely, but this change by itself is not worth an
incompatibility.)

(2) If we did want to make the treatment of resolution failure more
parallel, making existing legal schemas illegal is the more painful
way to achieve it; it would be easier to sell if the only change was
to eliminate some current error messages.

(3) Quixotic though it may be, I still like the original rationale
for allowing the resolution to fail. If a user wants a different
behavior from a processor, a processor can offer a 'fail on import
resolution failure' mode without being non-conforming -- processors
have wide latitude here. So for people who want the other decision,
it can be treated as a processor quality of service issue.

(4) The idea of changing this behavior may well have some support in
the WG. But counting noses, I don't see the idea generating
consensus. And if there is no consensus to change the status quo,
then the status quo remains.

> is this just FUD masquerading as backward compatibility, or are
> there well-known concrete scenarios that depend on this behavior?

There is plenty of FUD to go around, and this may be another
instance. I'll let you decide whether you are persuaded by the
argument that validity should not depend on the reliability of the
router.

> 4.2.4 Overriding component definitions
> Schema Representation Constraint: Override Constraints and Semantics
> clause 4.1.1, 4.2.1
> "Let D2' be a <schema> information item obtained by . Then ..." ???
> something missing 'by .'
> 4.2.4 Overriding component definitions Schema Representation
> Constraint: Override Constraints and Semantics clause 4.1.2
> "Note: One effect of the rule just given..." I cannot tell, due to
> previous comment's missing text

I think this was a markup snafu in the draft of 20 June 2008, owing
to a clerical error (mea culpa). The Last Call draft of 30 January
has the missing text:

    4.1.1 Let D2′ be a <schema> information item obtained by
    performing on D2 the transformation specified in Transformation
    for xs:override (§G.2).
    Then D2′ corresponds to a conforming schema (call it S2).

Thank you for catching this.

> 4.2.4 Overriding component definitions
> Schema Representation Constraint: Override Constraints and Semantics
> clause 4.1.2
> "Note: Another effect is..." You handle A <override> B <override> C
> It's not clear if there is deterministic behavior in the Y case:
> both B and C <override> A with conflicting specifications.

The intent is to have deterministic behavior -- at least, to the
extent that the existing spec, without 'override', has it.

In this case, if I have understood the example, I think the result
is clear. Assume schema document A defines an element E with type
xsd:anyURI, and schema documents B and C redefine that element as
having type xsd:boolean and xsd:decimal respectively (no built-in
type with an initial 'C', sorry, 'deCimal' is the best I can do). I
am creating a document starting from schema document D, which does
nothing but include schema documents B and C. (First question: is
this your scenario?)

Then the effect of the transformations prescribed by the spec is
that B corresponds to the same schema as B', where B' is just like B
except that instead of overriding A it includes A', a schema
document just like A except that the definition of E

    <element name="E" type="anyURI"/>

is replaced by the one specified in the override of A by B:

    <element name="E" type="boolean"/>

Analogously, C generates the same schema by overriding A as it would
by including A'', which contains

    <element name="E" type="decimal"/>

The effect on D is straightforward: by including B, it acquires a
top-level element E of type boolean, and by including C, it acquires
a top-level E of type decimal. This pair violates the rule that
there can only be one top-level E. Blammo. Er, I mean, not a legal
schema. (Sorry, I'm saying top-level not global, because I haven't
internalized the rule that says 'top-level' just applies to the XML;
I think of it as a synonym for 'global'.)
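
For readers who prefer to see the Y case run, here is a rough sketch
of the reasoning above. It is a toy model, not the transformation of
§G.2: each schema document is reduced to a dictionary from top-level
element names to type names, and the function names (override,
include_all) are invented for this illustration. It also assumes, as
in the example, that every overridden name already exists in the
target document.

```python
# Toy model (an assumption for illustration): a schema document is just
# a mapping from top-level element names to type names.
A = {"E": "anyURI"}


def override(target, replacements):
    """Return a copy of `target` with the given declarations replaced.

    Stands in for building A' (or A'') from A. Hypothetical helper; it
    assumes every name in `replacements` is already declared in `target`.
    """
    result = dict(target)
    result.update(replacements)
    return result


def include_all(*documents):
    """Merge top-level declarations, rejecting conflicting duplicates.

    Simplification: a name declared twice with the same type is accepted;
    a name declared with two different types is reported as an error.
    """
    merged = {}
    for doc in documents:
        for name, type_name in doc.items():
            if name in merged and merged[name] != type_name:
                raise ValueError(
                    f"two top-level declarations for {name!r}: "
                    f"{merged[name]} and {type_name}"
                )
            merged[name] = type_name
    return merged


# B behaves like B', which includes A'; C behaves like C', which includes A''.
B = override(A, {"E": "boolean"})
C = override(A, {"E": "decimal"})


def build_d():
    """D does nothing but include B and C."""
    return include_all(B, C)


if __name__ == "__main__":
    try:
        build_d()
    except ValueError as err:
        print("not a legal schema:", err)
```

The point of the sketch is only that the conflict is detected in the
same way whichever of B and C is taken first; it says nothing about
how a real processor represents components.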

> <include> clause 3.1.2 in particular, if 2.1 fires, seems to tell me
> I cannot "build up" a ns with components from several
> non-overlapping <schema> items...since I think this is possible
> today, not convinced I'm reading it right.

It shouldn't be telling you anything of the kind. The 'with the
possible exception of the schema component' is just to avoid a
problem with the previous wording. The spec used to say that if A
included B, then the schema for A included all the components of the
schema for B. But in the simple case (where B does not also include
A), the schema for B includes not just elements and types and so on,
but also a schema-description (or schema-as-a-whole) component of
the kind described in section 3.17. Conclusion: the schema for A
contains two schema-description components. Which is not allowed.
Blammo, illegal schema. Clearly not what was intended by 1.0,
clearly not what implementors have implemented -- but very clearly
what the spec said. So 1.1 excludes the schema component. (The
reference to a 'possible' exception covers the case where A includes
B and B includes A, where a sufficiently devious reader can argue
that there must be a kind of fix-point semantics going on here, and
so the schema for A and the schema for B are the same schema, with
the same schema-description component. And in that case, you can't
say the schema for A includes all the components in the schema for B
except for the schema-description component, because it does include
that one, too.)

> It also sounds like it prescribes an order of processing, but I had
> the impression that different orders were permissible (lazy
> retrieval strategies) which seems to conflict with that impression.

It describes logical dependencies, but not strictly speaking an
order of processing, in the same way that, by the usual rules of
arithmetic, the expression (a + b) * (c + d) defines the dependency
of the multiplication upon the sums which are its arguments.
The usual way to handle those dependencies is to perform the sums
and then the multiplication, but if you can't do that (because
you're a compiler and you know the values of c and d because they
are constants, but not of a and b which are variables) you may
choose instead to transform the expression into

    a * k + b * k

where k = c + d.

In the case of override, if you have A include B, A include C, B
override D, C override E, E include F, you certainly have the choice
of taking B and C in either order, if and only if you currently have
that choice.

> 5.2 Assessing Schema-Validity - strict
> ..."if they do not identify any declaration or definition, then no
> schema-validity assessment is performed. "
> This appears to say that the result is implementation-dependent. I
> rather expected some prescribed output, either an error or values
> for [validation attempted] etc.

Well, there is certainly a problem here. The accompanying text says
the PSVI produced by lax wildcard validation and strict wildcard
validation is the same -- which means "no" should change to "lax" or
else the note should change. (Hmm. A one-word change or rewrite the
entire note. Wonder which I'll choose?)

But the analogy with strict wildcard handling is in fact exact. If
the parent element E has a strict wildcard in its content model, and
the child element F matches that wildcard but has no declaration,
then N.B. F is not strictly validated (it can't be, we don't have a
declaration) and is not marked invalid (it can't be, we haven't
validated F, how can we mark F invalid?). The invalidity is in the
parent E, which is supposed to have only properly declared children,
but has been caught red-handed trying to smuggle F in over the
border. Blammo. Document is invalid, but it's E not F.

If F is the validation root of a validation in strict-wildcard mode,
then by analogy it is not F but the parent of F which needs to be
marked invalid.
F has no parent in the validation episode; the closest we have to a
parent is the calling application, which needs to be told "you
expected F to be declared -- it wasn't. Here's the PSVI I got, by
the way". Hence the expectation in the second paragraph of the note
that the invoking process will report an error to its environment.

I hope this helps. It would be nice if the spec were less dense
here, but since we wish NOT to constrain how processors face the
world, we don't have a lot of concrete information to go on. Even
the mention of an invoking application strikes some WG members as a
bit risqué.

On the plus side, it's a great topic for a blog entry or several.
And maybe a conference poster.

The wording changes proposed in response to the comments above can
be seen in context at

    http://www.w3.org/XML/Group/2004/06/xmlschema-1/structures.b6014.html

There are four of them, and the proposal elides unchanged sections.

I hope the changes, and the comments above for points on which no
change in wording is proposed, will resolve the issue(s) to your
satisfaction.

--
****************************************************************
* C. M. Sperberg-McQueen, Black Mesa Technologies LLC
* http://www.blackmesatech.com
* http://cmsmcq.com/mib
* http://balisage.net
****************************************************************
Received on Tuesday, 14 April 2009 19:02:07 UTC